How to run Google App Engine app indefinitely

Time:04-11

I successfully deployed a Twitter screenshot bot on Google App Engine. This is my first time deploying an app.

The first thing I noticed was that the app didn't start running until I clicked the link. When I did, the app worked (it replied to tweets with screenshots) as long as the tab was open and still loading. When I closed the tab, the bot stopped working.

Also, in the Cloud Shell log, I saw:

Handling signal: term  
[INFO] Worker exiting (pid 18)

This behaviour surprises me, as I expected the app to keep running on Google's servers indefinitely. My bot works by streaming from the Twitter API. The "Worker exiting" line above also surprises me.

Here is the relevant code:

def get_stream(set):
    global servecount 
    with requests.get(f"https://api.twitter.com/2/tweets/search/stream?tweet.fields=id,author_id&user.fields=id,username&expansions=author_id,referenced_tweets.id", auth=bearer_oauth, stream=True) as response:
        print(response.status_code)
        if response.status_code == 429:
            print(f"returned code 429, waiting for 60 seconds to try again")
            print(response.text)
            time.sleep(60)
            return
        if response.status_code != 200:
            raise Exception(
                f"Cannot get stream (HTTP {response.status_code}): {response.text}"
            )
        for response_line in response.iter_lines():
            if response_line:
                json_response = json.loads(response_line)
                print(json.dumps(json_response, indent=4))

                if json_response['data']['referenced_tweets'][0]['type'] != "replied_to":
                    print(f"that was a {json_response['data']['referenced_tweets'][0]['type']} tweet not a reply. Moving on.")
                    continue
                uname = json_response['includes']['users'][0]['username']
                tid = json_response['data']['id']
                reply_tid = json_response['includes']['tweets'][0]['id']
                or_uid = json_response['includes']['tweets'][0]['author_id']
                print(uname, tid, reply_tid, or_uid)
                followers = api.get_follower_ids(user_id='1509540822815055881')
                uid = int(json_response['data']['author_id'])
                if uid not in followers: 
                    try:    
                        client.create_tweet(text=f"{uname}, you need to follow me first :)\nPlease follow and retry. \n\n\nIf there is a problem, please speak with my creator, @JoIyke_", in_reply_to_tweet_id=tid, media_ids=[mid])
                    except:
                        print("tweet failed")
                        continue
                mid = getmedia(uname, reply_tid)
                #try:   
                client.create_tweet(text=f"{uname}, here is your screenshot: \n\n\nIf there is a problem, please speak with my creator, @JoIyke_", in_reply_to_tweet_id=tid, media_ids=[mid])
                    #print(f"served {servecount} users with screenshot")
                    #servecount += 1
                #except:
                #   print("tweet failed")
                editlogger()

def main():
    servecount, tries = 1, 1
    rules = get_rules()
    delete = delete_all_rules(rules)
    set = set_rules(delete)
    while True:
        print(f"starting try: {tries}")
        get_stream(set)
        tries += 1

If this is important, my app.yaml file has only one line:

runtime: python38

and I deployed the app from Cloud Shell with gcloud app deploy app.yaml

What can I do? I have searched and can't seem to find a solution. Also, this is my first time deploying an app successfully. Thank you.

CodePudding user response:

App Engine works on demand, i.e. it is only up while there are requests to the app (this is why the app works when you click on the URL). Although you can set one instance to be "running all the time" (min_instances), that would be an anti-pattern both for what you want to accomplish and for App Engine. Please read How Instances are Managed.

Looking at your code, you're polling Twitter for data every minute, so the best option for you is Cloud Scheduler + Cloud Functions.

Cloud Scheduler will call your function, which checks whether there is data to process; if there is none, the process terminates. This will help you save costs, because instead of having something running all the time, the function only runs for as long as it is needed.
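As a rough illustration of that "check once, then exit" pattern, here is a minimal sketch of a single polling pass. The names process_new_mentions, fetch_mentions and handle_mention are hypothetical and not part of the Twitter or Google Cloud APIs; the sketch assumes you can write a fetch_mentions(since_id) callable that returns tweets newer than a given id.

```python
from typing import Callable, Iterable, Optional


def process_new_mentions(
    fetch_mentions: Callable[[Optional[int]], Iterable[dict]],
    handle_mention: Callable[[dict], None],
    last_seen_id: Optional[int] = None,
) -> Optional[int]:
    """Run one polling pass and return the newest tweet id seen so far.

    fetch_mentions(since_id) is a hypothetical callable returning the
    tweets newer than since_id; handle_mention replies to one tweet.
    """
    newest = last_seen_id
    for mention in fetch_mentions(last_seen_id):
        handle_mention(mention)
        mention_id = int(mention["id"])
        # Track the highest id so the next run can skip handled tweets.
        if newest is None or mention_id > newest:
            newest = mention_id
    return newest
```

The function that Cloud Scheduler triggers would call this once and return. Because each invocation starts fresh, the returned id would have to be stored somewhere outside the function (a database or a file in a bucket, for example) and passed back in on the next run.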

On the other hand, I'm not an expert with the Twitter API, but if there is a way for Twitter to call your function directly instead of you polling Twitter, that would be better: the function would only run when there is data to process instead of checking every n minutes, which optimizes your costs.

As a piece of advice, first review all the options available in GCP (or whichever provider you use), then choose the best one for your use case. Picking a product just because it supports your programming language will not necessarily work as you expect, as in this case.

CodePudding user response:

  1. Google App Engine works on demand i.e. when it receives an HTTP(s) request.

  2. Neither warmup requests nor min_instances > 0 will meet your needs. A warmup request tries to 'start up' an instance before your requests come in. Setting min_instances > 0 simply says not to kill the instance, but you still need an HTTP request to invoke the service (which is what you did by opening a browser tab and entering your app's URL).

  3. You may ask: since you've 'started up' the instance by opening a browser tab, why doesn't it keep running afterwards? The answer is that every request to a Google App Engine (Standard) app must complete within 1 to 10 minutes, depending on the type of scaling your app is using (see documentation). For Google App Engine Flexible, the timeout goes up to 60 minutes. This means your service will time out after at most 10 minutes on GAE Standard or 60 minutes on GAE Flexible.

  4. I think the best solution for you on GCP is to use Google Compute Engine (GCE). With GCE you spin up a virtual machine (VM), deploy your code to it and kick off your code, which then runs continuously. Pick the lowest configuration so you can stay within the free tier.
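On a VM, the "while True" loop from the question becomes the whole program, so it is worth making it survive dropped connections. The following is a minimal sketch of such a loop with exponential backoff, not the asker's actual code: run_forever and consume_stream are hypothetical names, and the sleep and max_restarts parameters exist only so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


def run_forever(
    consume_stream: Callable[[], None],
    max_backoff: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    max_restarts: Optional[int] = None,
) -> None:
    """Call consume_stream repeatedly, backing off after each failure.

    consume_stream is a hypothetical callable that blocks while the
    stream is open and returns or raises when the connection ends.
    """
    backoff = 1.0
    restarts = 0
    # max_restarts=None means loop indefinitely (the normal case on a VM).
    while max_restarts is None or restarts < max_restarts:
        try:
            consume_stream()
            backoff = 1.0  # clean return: reset the delay
        except Exception as exc:
            print(f"stream failed: {exc!r}; retrying in {backoff:.0f}s")
            sleep(backoff)
            backoff = min(backoff * 2, max_backoff)
        restarts += 1
```

With the code from the question, consume_stream could be something like lambda: get_stream(set). To keep the script alive after you log out of the VM, you would typically start it under a process supervisor or a tool such as nohup.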
