The JSON object must be str, bytes or bytearray, not list

Time:10-13

I am trying to access JSON data but I am getting the error in the title. My code is:

with open(filepath + "decompressed_twitter_lot1file1.txt", 'rb') as fh:
    for line in fh:
        object = json.loads(line)
        urls_in_tweet = object['entities']['urls']
        domains_in_tweet = []
        print(urls_in_tweet)
        for url in urls_in_tweet:
            for key, value in url.items():
                print(key,value)
                domain = tldextract.extract(value).registered_domain
                print("domain")

My output:

[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[{
  'display_url': 'msnbc.com/rachel-maddow/…',
  'indices': [67, 90],
  'expanded_url': '//www.msnbc.com/rachel-maddow/watch/trump-admin-coverage-maxim-watch-what-they-do-not-what-they-say-66934341943',
  'url': '//t/zHmMchTCIf'
}]
display_url msnbc.com/rachel-maddow/…
domain
indices [67, 90]

After this, I get the error below. I don't understand why nothing is printed after the indices key.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-90-cb728568a4f1> in <module>
      8             for key, value in url.items():
      9                 print(key,value)
---> 10                 domain = tldextract.extract(value).registered_domain
     11                 print("domain")

~/.local/lib/python3.8/site-packages/tldextract/tldextract.py in extract(url, include_psl_private_domains)
    294     url, include_psl_private_domains=False
    295 ):  # pylint: disable=missing-function-docstring
--> 296     return TLD_EXTRACTOR(url, include_psl_private_domains=include_psl_private_domains)
    297 
    298 

~/.local/lib/python3.8/site-packages/tldextract/tldextract.py in __call__(self, url, include_psl_private_domains)
    214 
    215         netloc = (
--> 216             SCHEME_RE.sub("", url)
    217             .partition("/")[0]
    218             .partition("?")[0]

TypeError: expected string or bytes-like object

This data is a small part of the Twitter API data. How can I access every key-value pair of this JSON data and load the values into the domain list?

CodePudding user response:

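json.loads expects a str, bytes or bytearray; passing it a list raises the error in the title. The traceback shown in the question comes from a different call, though: tldextract.extract is given the value of the 'indices' key, which is a list and not a string.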
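To illustrate where the error in the title comes from, here is a minimal sketch. The sample line is made up and only mimics the shape of the tweets in the question; the point is that json.loads turns a string into Python objects once, and its result should not be fed back into json.loads.

```python
import json

# Hypothetical one-line sample, shaped like a tweet from the question.
line = '{"entities": {"urls": []}}'

tweet = json.loads(line)           # str in, dict out
urls = tweet["entities"]["urls"]   # already a Python list, no further parsing needed

try:
    json.loads(urls)               # a list is not valid input for json.loads
except TypeError as exc:
    message = str(exc)
    print(message)
```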

If you want to get the key-value pairs you can do this:

fs = [{'display_url': 'eonli.ne/33XF5V1', 'indices': [90, 113], 'expanded_url': 'eonli.ne/33XF5V1', 'url': 't.co/flhUdZcUzB'}]

for k,v in fs[0].items():
  print(f"{k}, {v}")

fs[0] is a dictionary, so you can get its key-value pairs with items().

There is no need for json.loads here, since fs is already a Python list.
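For the original goal of building a list of domains, one possible sketch follows. It assumes you only want the 'expanded_url' field of each entry and skips anything that is not a string, such as the 'indices' list. The names collect_domains and hostname_only are hypothetical helpers, and hostname_only is a standard-library stand-in used so the example runs without tldextract; it returns the full host name, not the registered domain.

```python
from urllib.parse import urlparse


def collect_domains(tweet, extract_domain):
    """Return one domain per URL entry in tweet['entities']['urls']."""
    domains = []
    for url_entry in tweet.get("entities", {}).get("urls", []):
        # Only 'expanded_url' is passed on; 'indices' holds a list of ints,
        # which is what triggered the TypeError in the question.
        value = url_entry.get("expanded_url")
        if isinstance(value, str):
            domains.append(extract_domain(value))
    return domains


def hostname_only(url):
    # Stand-in extractor (an assumption, not tldextract): returns the host part.
    parsed = urlparse(url if "//" in url else "//" + url)
    return parsed.hostname or ""


tweet = {"entities": {"urls": [{
    "display_url": "eonli.ne/33XF5V1",
    "indices": [90, 113],
    "expanded_url": "eonli.ne/33XF5V1",
    "url": "t.co/flhUdZcUzB",
}]}}

print(collect_domains(tweet, hostname_only))
```

With tldextract installed, you could pass lambda u: tldextract.extract(u).registered_domain as the second argument instead, which is the call used in the question.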
