I am having some issues inserting into MongoDB via FastAPI.
The code below works as expected. Notice how the response variable is not passed to response_to_mongo(). The model is an sklearn ElasticNet model.
from typing import List

import pandas as pd
import pymongo
from fastapi import FastAPI

app = FastAPI()

# `model` is the fitted sklearn ElasticNet model, loaded elsewhere.


def response_to_mongo(r: dict):
    client = pymongo.MongoClient("mongodb://mongo:27017")
    db = client["models"]
    model_collection = db["example-model"]
    model_collection.insert_one(r)


@app.post("/predict")
async def predict_model(features: List[float]):
    prediction = model.predict(
        pd.DataFrame(
            [features],
            columns=model.feature_names_in_,
        )
    )
    response = {"predictions": prediction.tolist()}
    # A brand-new dict is passed here, not the `response` variable.
    response_to_mongo(
        {"predictions": prediction.tolist()},
    )
    return response
However, when I write predict_model() like this and pass the response variable to response_to_mongo():
@app.post("/predict")
async def predict_model(features: List[float]):
    prediction = model.predict(
        pd.DataFrame(
            [features],
            columns=model.feature_names_in_,
        )
    )
    response = {"predictions": prediction.tolist()}
    response_to_mongo(
        response,
    )
    return response
I get an error stating that:
TypeError: 'ObjectId' object is not iterable
From my reading, it seems that this is due to BSON/JSON issues between FastAPI and Mongo. However, why does it work in the first case when I do not use a variable? Is this due to the asynchronous nature of FastAPI?
CodePudding user response:
As per the documentation:
When a document is inserted a special key, "_id", is automatically added if the document doesn’t already contain an "_id" key. The value of "_id" must be unique across the collection. insert_one() returns an instance of InsertOneResult. For more information on "_id", see the documentation on _id.
Thus, in the second case, when you pass the dictionary to the insert_one() function, PyMongo adds the unique identifier (i.e., an ObjectId) to that same dictionary, in place, under the "_id" key. When the endpoint then returns response, the ObjectId cannot be serialized (by default, FastAPI serializes the data using the jsonable_encoder and returns a JSONResponse). In the first case you pass a separate dictionary literal to response_to_mongo(), so the "_id" key is added to that temporary object and response is left untouched. This is unrelated to the asynchronous nature of FastAPI.
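The difference between the two cases can be reproduced without a database. The sketch below is only an illustration: fake_insert_one() and FakeObjectId are hypothetical stand-ins that copy the single PyMongo behaviour relevant here (adding "_id" to the caller's dict when it is missing); they are not part of PyMongo.

```python
class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId (not JSON-serializable)."""


def fake_insert_one(document: dict) -> None:
    # Assumption: mimics insert_one() adding "_id" to the dict it receives.
    document.setdefault("_id", FakeObjectId())


# Case 1: a fresh dict literal is passed, so `response` is never touched.
response = {"predictions": [1.5]}
fake_insert_one({"predictions": [1.5]})
case_1_keys = sorted(response)

# Case 2: the same dict object is passed and is mutated in place.
response = {"predictions": [1.5]}
fake_insert_one(response)
case_2_keys = sorted(response)

print(case_1_keys)  # ['predictions']
print(case_2_keys)  # ['_id', 'predictions']
```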
Solution 1
Remove the "_id" key from the response dictionary before returning it (see here on how to remove a key from a dict):
response.pop('_id', None)
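A related option, not mentioned above, is to hand the database a copy so that response is never modified in the first place. This is a minimal sketch under the same assumption as before: fake_insert_one() and FakeObjectId are hypothetical stand-ins for collection.insert_one() and bson.ObjectId, and store_copy() is an illustrative helper name.

```python
class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId."""


def fake_insert_one(document: dict) -> None:
    # Stand-in for collection.insert_one(): adds "_id" in place.
    document.setdefault("_id", FakeObjectId())


def store_copy(response: dict) -> None:
    # dict(response) is a shallow copy; "_id" is added to the copy only,
    # because the key is set on the top-level dict.
    fake_insert_one(dict(response))


response = {"predictions": [0.25, 0.75]}
store_copy(response)
print(response)  # {'predictions': [0.25, 0.75]}
```

In the real endpoint this would correspond to calling response_to_mongo(dict(response)), which keeps the returned dictionary free of the "_id" key.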
Solution 2
Dump the loaded BSON to a valid JSON string and then reload it as a dict, as described here and here.
from bson import json_util
import json
response = json.loads(json_util.dumps(response))
Solution 3
Define a custom JSONEncoder, as described here:
import json
from bson import ObjectId
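
class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        # Convert ObjectId to its string form; defer everything else.
        if isinstance(o, ObjectId):
            return str(o)
        return json.JSONEncoder.default(self, o)


# Note: encode() returns a JSON string, not a dict.
response = JSONEncoder().encode(response)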
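Because encode() returns a string, returning that value directly from the endpoint would make FastAPI serialize the string a second time. One way around this, sketched here as an assumption rather than as part of the original answer, is to load the encoded string back into a dict. FakeObjectId and FakeIdEncoder are hypothetical names used so the example runs without the bson package; with PyMongo installed, the same pattern applies to bson.ObjectId.

```python
import json


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId."""

    def __init__(self, value: str):
        self.value = value

    def __str__(self) -> str:
        return self.value


class FakeIdEncoder(json.JSONEncoder):
    def default(self, o):
        # Serialize the stand-in id as its string form.
        if isinstance(o, FakeObjectId):
            return str(o)
        return super().default(o)


response = {"_id": FakeObjectId("abc123"), "predictions": [1.5]}
encoded = FakeIdEncoder().encode(response)  # a JSON string
decoded = json.loads(encoded)               # a plain dict again

print(decoded)  # {'_id': 'abc123', 'predictions': [1.5]}
```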