Home > Blockchain >  sqlite3 Date Comparison from a string column
sqlite3 Date Comparison from a string column

Time:11-12

im using sqlite3 to store data from a pandas dataframe

i Scrape data from Twitter and i want it to be every hour

and to do that i want to filter the date in the database the date format is like this :

 2021-11-11 08:07:33 00:00

the query im using :

cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at > " li " ")

li is a variable last inserted date before filling the db again

cur.execute("SELECT tweet_created_at FROM tweets_b_db ORDER BY tweet_created_at DESC LIMIT 1")
    li = cur.fetchone()

What it returns :

can only concatenate str (not "tuple") to str

my code :

import tweepy
import time
import datetime
import pandas as pd
import sqlite3

con = sqlite3.connect('tweetScaping.db')
cur = con.cursor()
consumer_key = "**********************"
consumer_secret = "****"
access_token = "****-*****"
access_token_secret = "***************"
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

# using tweepy to search for the keyword Bouygues telecom
text_query = 'bouygues telecom OR @bouyguestelecom OR #Bouygues'
count = 200




try:
    # Creation of query method using parameters
    tweets = tweepy.Cursor(api.search_tweets, q=text_query   " -filter:retweets").items(count)

    # Pulling information from tweets iterable object
    tweets_list = [[tweet.created_at, tweet.id, tweet.text] for tweet in tweets]

    # Creation of dataframe from tweets list
    # Add or remove columns as you remove tweet information
    # tweets_df = pd.DataFrame(columns=['tweet_created_at', 'tweet_id', 'tweet_text'])
    tweets_df = pd.DataFrame(tweets_list)
    tweets_df.columns = ['tweet_created_at', 'tweet_id', 'tweet_text']


    #last inserted
    cur.execute("SELECT tweet_created_at FROM tweets_b_db ORDER BY tweet_created_at DESC LIMIT 1")
    li = cur.fetchone()


    # to insert results to database (sqlite3)
    tweets_df.to_sql(name='tweets_b_db', con=con, if_exists='replace')


    # to show table content
    cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at > " li " ")
    print(cur.fetchall())

    

except BaseException as e:
    print('failed on_status,', str(e))
    time.sleep(3) 

Update : using :

cur.execute("SELECT tweet_created_at FROM tweets_b_db ORDER BY tweet_created_at DESC LIMIT 1")
data = cur.fetchone()
data = data[0]
cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at >= Datetime('{data}')")
print(cur.fetchall())

returns nothing :

[]

would be very helpful if someone can guide me to the right direction

CodePudding user response:

fetchone() returns a tuple that contains all of your requested columns in the query. in your case, there's a single column (tweet_created_at), so a single element (value of tweet_created_at) will be present in your tuple, which can be accessed at index 0.

li = cur.fetchone()
li = li[0]

CodePudding user response:

If you use an f-string for the sql statement:

cur.execute(f"SELECT * FROM tweets_b_db  WHERE tweet_created_at >= Datetime('{data}')")

I'm sure your code will work.

But, the recommended way to pass parameters is with ? placeholders:

cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at >= Datetime(?)", (data,))

Also, if data has the correct datetime format yyyy-mm-dd hh:MM:ss then you don't need the function DATETIME():

cur.execute("SELECT * FROM tweets_b_db  WHERE tweet_created_at >= ?", (data,))
  • Related