Apache Beam: Keys

Get the Keys from a Key-Value Pair


Los Angeles

Overview

What if you only care about the keys of your PCollection and not the values? Maybe you have KV pairs of words and their counts (KV<String, Integer>) and want to use just the words in another PCollection.

You should use the Keys transform!

When You Should Use the Keys Transform

When you only want the keys from a KV PCollection.

How to Use the Keys Transform

Just apply the built-in transform to a PCollection of KVs. For an input of type PCollection<KV<K, V>>, the output is a PCollection<K> containing one key per input element. Duplicate keys are not removed.

Example: Extract Keys from Word Counts

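    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.Keys;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;

    // Create key/value pairs (assumes an existing Pipeline named `pipeline`)
    PCollection<KV<String, Integer>> pairs =
        pipeline.apply(
            Create.of(KV.of("one", 1), KV.of("two", 2), KV.of("three", 3), KV.of("four", 4)));
    // Returns only the keys of the collection: PCollection<KV<K,V>> -> PCollection<K>
    PCollection<String> keysOnly = pairs.apply(Keys.create());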
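
To make the output semantics concrete, here is a rough plain-Java analogy of what the transform does to each element. This is only a sketch and not Beam code: the class `KeysSketch` and the helper `keysOf` are hypothetical names made up for illustration, and a `List` of `Map.Entry` stands in for a PCollection of KVs. Unlike a list, a real PCollection has no guaranteed element order.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KeysSketch {
    // Hypothetical helper: maps every pair to its key, one output per input.
    // Nothing is deduplicated, so a repeated key appears repeatedly.
    static <K, V> List<K> keysOf(List<Map.Entry<K, V>> pairs) {
        return pairs.stream()
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
            Map.entry("one", 1),
            Map.entry("two", 2),
            Map.entry("one", 11));
        System.out.println(keysOf(pairs)); // prints [one, two, one]
    }
}
```

If you need each key only once in a real pipeline, Beam's Distinct transform can be applied to the output of Keys.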

Conclusion

Check out other useful transforms from the official Apache Beam documentation.

Apache Beam and Google Cloud Dataflow

Part 10 of 12

Dive into the world of scalable data processing with our comprehensive series on Apache Beam and Google Cloud Dataflow.

Nikhil Rao's Blog