Apache Beam: Values Transform
Overview
What if you only care about the values in your PCollection and not the keys? Maybe you have key/value pairs of words and their counts (KV<String, Integer>) and want to use just the counts in another PCollection.
You should use the Values transform!
When You Should Use the Values Transform
When you only want the values from a PCollection of KVs.
How to Use the Values Transform
Just apply the built-in Values transform to a PCollection of KVs. The output PCollection has the element type of the KV's values and contains one value for every input element, so duplicate values are kept.
Example: Extract Values from Word Counts
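import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Values;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

Pipeline pipeline = Pipeline.create();

// Create key/value pairs
PCollection<KV<String, Integer>> pairs =
    pipeline.apply(
        Create.of(KV.of("one", 1), KV.of("two", 2), KV.of("three", 3), KV.of("four", 4)));

// Keep only the values of the collection: PCollection<KV<K, V>> -> PCollection<V>
PCollection<Integer> valuesOnly = pairs.apply(Values.create());

// valuesOnly contains 1, 2, 3 and 4. A PCollection is unordered, so the
// elements may appear in any order, for example:
// 3
// 1
// ... etc.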
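To see the one-value-per-pair behaviour without setting up a Beam runner, here is a minimal plain-Java sketch. It is only an analogy: ValuesSketch and valuesOf are hypothetical names that are not part of Beam, and java.util.Map.Entry stands in for KV.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ValuesSketch {
    // Hypothetical helper, NOT a Beam API: mimics what Values does on an
    // in-memory list by emitting exactly one value per input pair.
    public static <K, V> List<V> valuesOf(List<Map.Entry<K, V>> pairs) {
        List<V> out = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            out.add(pair.getValue()); // the key is dropped, the value is kept
        }
        return out;
    }

    public static void main(String[] args) {
        // "one" and "uno" share the value 1; both copies survive.
        List<Map.Entry<String, Integer>> pairs = List.of(
            Map.entry("one", 1), Map.entry("two", 2), Map.entry("uno", 1));
        System.out.println(valuesOf(pairs)); // prints [1, 2, 1]
    }
}
```

Unlike this list-based sketch, a real PCollection has no guaranteed element order, so only the contents of the result carry over to Beam, not the sequence.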
Conclusion
Check out other useful transforms from the official Apache Beam documentation.