Home > front end >  Add data to specific column using writerows, lambda
Add data to specific column using writerows, lambda

Time:12-23

I've iterated through JSON objects returned to me via an API query, and can get the data (obj_key1) into my target_csv, but the data is showing up in the first column, not the data column, which I am trying to specify.

Example JSON:

[{'obj_key': 0.5,
  'obj_key1': 4000,
  'obj_key2': '2020-05-01',
  'obj_key3': 0.0,
  'obj_key4': 'US',
  'obj_key5': 50000,
  'obj_key6': 11,
  'obj_key7': 0.0,
  'obj_key8': 20000,
  'obj_key9': 0.01,
  'obj_key10': '6786547',
  'obj_key11': None,
  'obj_key12': 12},
 {'obj_key': 0.4,
  'obj_key1': 5000,
  'obj_key2': '2020-05-02',
  'obj_key3': 0.0,
  'obj_key4': 'US',
  'obj_key5': 657435,
  'obj_key6': 112,
  'obj_key7': 0.0,
  'obj_key8': 678548,
  'obj_key9': 0.032,
  'obj_key10': '6786547',
  'obj_key11': None,
  'obj_key12': 13},...]

Code:

import csv
import json
import requests

token = {token}

with open('../folder/file.csv') as src, open('target_file.csv', 'w', newline='') as tgt:
    reader = csv.reader(src)
    writer = csv.writer(tgt)

    columns = next(reader)

    writer.writerow(columns)

    headers = {
        'Authorization': token,
        'Content-Type': 'application/json; charset=utf-8',
    }

    for id, name, date, data in reader:

        response = requests.get(f'https://{base_url}/api/endpoint?start_date=2021-05-01&end_date=2021-05-15'   '&id='   str(id),
            headers=headers
        )

        response_data = response.json()
        for i in range(len(response_data)):
            data = writer.writerows(map(lambda x: [response_data[i]['obj_key1']], range(1)))

Expected CSV Output:

id name date data
6786547 Name 2020-05-01 4000
6786547 Name 2020-05-02 5000

CodePudding user response:

I'm going recommend something that looks very different, but should get the job done all the same.

The big idea for me is to split the process up into distinct chunks—it'll be much easier for everyone to understand.

Read first

I've chosen to save all the API reads into a list of dicts that will be written-out with the CSV DictWriter in the next part.

import csv
import requests

path = f'https://{base_url}/api/endpoint?start_date=2021-05-01&end_date=2021-05-15'
token = {token}
headers = {
    'Authorization': token,
    'Content-Type': 'application/json; charset=utf-8',
}

final_rows = []
with open('../folder/file.csv') as src:
    reader = csv.reader(src)
    next(reader)  # discard this header, will be created from dict row

    for my_id, name, date in reader:  # omitted `data` from input
        response = requests.get(path   f'&id={my_id}', headers=headers)

        # Assuming `response.json()` looks like the sample JSON you provided
        for json_data in response.json():
            final_row = {
                'id': my_id,
                'name': name,
                'date': date,
                'data': json_data['obj_key1']
            }

        final_rows.append(final_row)

I mocked up final_rows, which looks like:

final_rows = [
    {
        'id': '6786547',
        'name': 'Name',
        'date': '2020-05-01',
        'data': '4000'
    },
    {
        'id': '6786547',
        'name': 'Name',
        'date': '2020-05-02',
        'data': '5000'
    }
]

Then, write

Now, final_rows will almost just write itself:

with open('target_file.csv', 'w', newline='') as tgt:
    writer = csv.DictWriter(tgt, fieldnames=final_rows[0].keys())
    writer.writeheader()
    writer.writerows(final_rows)

and I get:

id,name,date,data
6786547,Name,2020-05-01,4000
6786547,Name,2020-05-02,5000

CodePudding user response:

Here's how to implement what I suggested in my comment about calling writerow() multiple times and passing it all four columns (id, name, date, and data). You might be able to optimize it to create the rows first and then make one call to writerows() instead, but it's doubtful to me whether it would be worth the effort.

Note it seems odd to me that you're ignoring the date in 'obj_key2' of each object of the list in the response and using the one in CSV file being read — but that seems to be what you said in your question.

import csv
import json
import requests

token = {token}

with open('../folder/file.csv', 'r', newline='') as src, \
     open('target_file.csv', 'w', newline='') as tgt:
    reader = csv.reader(src)
    writer = csv.writer(tgt)

    writer.writerow(next(reader)   ['data'])  # Copy header row and add column.

    headers = {'Authorization': token,                           
               'Content-Type': 'application/json; charset=utf-8'}

    for id, name, date, data in reader:
        response = requests.get(
            f'https://{base_url}/api/endpoint?start_date=2021-05-01&end_date=2021-05-15' 
            f'&id={id}', headers=headers)

        for obj in response.json():
            writer.writerow([id, name, date, obj['obj_key1']])
  • Related