Home > OS >  Summing data from several csv files
Summing data from several csv files

Time:05-05

I am a novice programmer who got a homework consisting of something like this:

  • the user chooses two quarters, which will be used to calculate the sum from some data between the quarters, and thus also the years (the following years will be chosen: 2016, 2017, 2018, 2019). The task does not seem to be some kind of mega difficult, however, I do not know some interesting way to write the code when the user selects, for example, the first quarter of 2016 and the fourth quarter of 2019). For the task I was given four files: 2016.csv, 2017.csv, 2018.csv, 2019.csv.

What advice could you guys give to assist an inexperienced colleague? :D

EDIT: Thus:

  1. the user selects the first quarter of a given year, then the second quarter of a given year
  2. the program sums the data; if the user selects the 1st quarter of 2016 and the 3rd quarter of 2018, then the program must open the 2016.csv, 2017.csv, and 2018.csv files to count the required data
  3. divide it by the number of days occurring in that period (let's assume that for each quarter it will be a constant number - 30 days)
  4. outputting:
  • the total sum of the period specified by the user,
  • the number of days in this period
  • printing the average number of data per day

CodePudding user response:

You have two ways to do it:

  1. Use databases based on the relational model of data (SQLite, MySQL, PostgreSQL, MS SQL Server, Oracle) for this case SQLite the most simple than other databases.
  • You need to create table with columns that you have in csv
  • Add column date_created date
  • Populate table with data from csv files
  • Create query for summing data: select sum(*) from table where date_crated between '2016-06-01' and '2016-09-01';
  1. Working with files:
  • Write function for read csv data, use csv.DictReader
  • Write function for summing all line without conditions
  • Write function for summing with checking quarter conditions

CodePudding user response:

Ok, i see that amount of data is something what makes exercise hard for you. Best option is not working on simple text-data, use pandas to make it easier.

import pandas as pd

Here is pandas documentation.

We don't know what is inside your .csv files and how data is presented. You can find here how use delimeters, separators and divide data inside columns. You can also get headers and sort data by them (for example sort by quarters).

  • Related