I want to use MapReduce to find, for each day over several days, the record with the highest temperature. The idea is simple: use the date as the key and the time plus temperature as the value. In the reduce method, traverse the values once to find the highest temperature max, then traverse the values again to find the records with that temperature. Unfortunately, the program runs and finishes, but the output file is empty, and I don't know why. I wonder whether the values cannot be traversed twice.
The input file format is: date time temperature
Example:
2017-06-23 [hh] 12
2017-06-23 12 25
Java code is as follows:
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static class MaxTemperatureMap extends Mapper<Object, Text, Text, Text> {
        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            // Input line: date time temperature, separated by single spaces
            String[] str = value.toString().split(" ");
            // key = date, value = time + temperature (e.g. "12" + "25" -> "1225")
            context.write(new Text(str[0]), new Text(str[1] + str[2]));
        }
    }

    public static class MaxTemperatureReduce extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            // First traversal: find the highest temperature, max
            int max = Integer.MIN_VALUE;
            for (Text value : values) {
                max = Math.max(max, Integer.parseInt(value.toString().substring(2)));
            }
            // context.write(key, new Text(String.valueOf(max)));

            // Second traversal: find the records whose temperature equals max and write them to the output file
            for (Text value : values) {
                if (max == Integer.parseInt(value.toString().substring(2))) {
                    context.write(key, value);
                }
            }
            // context.write(new Text(key.toString() + ":" + value.toString().substring(0, 2)), new IntWritable(max));
        }
    }

    public static void main(String[] args) throws Exception {
        // Set the HDFS address
        String ipName = "127.0.0.1";
        String hdfs = "hdfs://" + ipName + ":9000";
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        String jobName = "MaxTemperature";
        removeOutput(conf, hdfs);
        job.setJarByClass(MaxTemperature.class);
        job.setMapperClass(MaxTemperatureMap.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        // job.setCombinerClass(MaxTemperatureCombine.class);
        job.setReducerClass(MaxTemperatureReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        // 3. Set the input and output paths
        String dataDir = "/workspace/flowStatistics/date_data"; // data directory
        String outputDir = "/workspace/flowStatistics/maxTemperature"; // output directory
        Path inPath = new Path(hdfs + dataDir);
        Path outPath = new Path(hdfs + outputDir);
        FileInputFormat.addInputPath(job, inPath);
        FileOutputFormat.setOutputPath(job, outPath);
        System.out.println("Job: " + jobName + " is running...");
        if (job.waitForCompletion(true)) {
            System.out.println("success!");
            System.exit(0);
        } else {
            System.out.println("failed!");
            System.exit(1);
        }
    }

    // Avoids the error raised when the output directory already exists
    // by deleting the previous run's output folder before the job starts.
    private static void removeOutput(Configuration conf, String ipPre) throws IOException {
        String outputPath = ipPre + "/workspace/flowStatistics/maxTemperature";
        FileSystem fs = FileSystem.get(URI.create(outputPath), conf);
        Path path = new Path(outputPath);
        if (fs.exists(path)) {
            fs.deleteOnExit(path);
        }
        fs.close();
    }
}

CodePudding user response:
Check the job counters: is the reduce output record count zero? If so, you need to check your logic.
CodePudding user response:
In your reduce, when you loop over the Iterable<Text> values, is the way you obtain the maximum correct? The value is a time + temperature string.
CodePudding user response:
Quoting the 1st-floor reply from zgycsmb: Check the job counters: is the reduce output record count zero? If so, you need to check your logic.
Oh, I'll go take a look.
CodePudding user response:
Quoting the 2nd-floor reply from Zhang Boyi: In your reduce, when you loop over the Iterable<Text> values, is the way you obtain the maximum correct? The value is a time + temperature string.
After that I call substring(2), which extracts the temperature.
CodePudding user response:
Excuse me, did you manage to solve this? Can the values be traversed twice?
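
On the last question: as far as I know, the Iterable<Text> that Hadoop passes to reduce is backed by a single forward-only iterator, so a second for loop over values sees no elements, which would explain the empty output. One common workaround is to do everything in a single pass, keeping the running maximum and buffering the records that match it. The sketch below is plain Java with no Hadoop dependency; MaxRecordsSketch, oneShot and maxRecords are hypothetical names, String stands in for Text, and the sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-alone sketch; not the asker's code and not a Hadoop API.
class MaxRecordsSketch {

    // Returns an Iterable that always hands back the SAME iterator,
    // so it can only be consumed once (assumed to mimic the reducer's values).
    static Iterable<String> oneShot(List<String> records) {
        final Iterator<String> it = records.iterator();
        return () -> it;
    }

    // Single traversal: track the running maximum and buffer every record
    // whose temperature equals it. Values are "hh" + temperature, e.g. "1225".
    static List<String> maxRecords(Iterable<String> values) {
        int max = Integer.MIN_VALUE;
        List<String> best = new ArrayList<>();
        for (String value : values) {
            int temperature = Integer.parseInt(value.substring(2));
            if (temperature > max) {
                // New maximum: earlier candidates are no longer the best.
                max = temperature;
                best.clear();
                best.add(value);
            } else if (temperature == max) {
                best.add(value);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Iterable<String> values = oneShot(Arrays.asList("0112", "1225", "1825", "2309"));
        System.out.println(maxRecords(values)); // prints [1225, 1825]

        // A second loop over the same Iterable finds nothing left to read.
        int seen = 0;
        for (String ignored : values) {
            seen++;
        }
        System.out.println(seen); // prints 0
    }
}
```

In a real reducer, Hadoop reuses the same Text instance between iterations, so buffer a copy (for example value.toString() or new Text(value)) rather than the reference, and call context.write(key, ...) for each buffered record after the loop.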