Apache Beam: Values Transform

Overview

What if you only care about the values in your PCollection and not the keys? Say you have key/value pairs of words and their counts (KV<String, Integer>) and want to use just the counts in another PCollection.

You should use the Values transform!

When You Should Use the Values Transform

When you only want the values from a PCollection of KVs.

How to Use the Values Transform

Just apply the built-in Values transform (Values.create()) to a PCollection of KVs. The output PCollection has the element type of the KV's values and contains exactly one value per input element, so duplicate values are kept.

Example: Extract Values from Word Counts

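    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.Values;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;

    // Create key/value pairs (assumes an existing Pipeline named `pipeline`)
    PCollection<KV<String, Integer>> pairs =
        pipeline.apply(
            Create.of(KV.of("one", 1), KV.of("two", 2), KV.of("three", 3), KV.of("four", 4)));
    // Keep only the values: PCollection<KV<K, V>> -> PCollection<V>
    PCollection<Integer> valuesOnly = pairs.apply(Values.create());

    // valuesOnly now contains 1, 2, 3 and 4, in no guaranteed order, e.g.
    // 3
    // 1
    // ... etc.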
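
If it helps to see the semantics without a Beam runner, here is a minimal plain-Java analogy. It is not Beam code: the class ValuesAnalogy and the helper valuesOf are hypothetical names made up for this sketch, and a java.util.List stands in for the PCollection. It only illustrates that dropping the keys yields one value per input pair and does not deduplicate.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Plain-Java analogy for the Values transform (illustrative only, not Beam code). */
class ValuesAnalogy {

    /** Returns the value of every entry: one output per input, duplicates kept. */
    static <K, V> List<V> valuesOf(List<Map.Entry<K, V>> pairs) {
        return pairs.stream()
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("one", 1),
                Map.entry("uno", 1), // same value under a different key
                Map.entry("two", 2));

        // Three pairs in, three values out: the duplicate 1 is not removed.
        System.out.println(valuesOf(pairs)); // prints [1, 1, 2]
    }
}
```

Unlike this list, a real PCollection has no guaranteed element order, so only the multiset of values carries over to Beam.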

Conclusion

Check out other useful transforms from the official Apache Beam documentation.