Insert a pandas DataFrame into MongoDB
Pandas is one of the most commonly used open-source Python libraries for data manipulation. It provides high-performance data manipulation and analysis through data structures such as the DataFrame.
Here we use pandas to insert a DataFrame into MongoDB. I use the yfinance library to download a dataset for a ticker from Yahoo Finance and save that data to MongoDB.
First, we import all the necessary libraries:
from pandas_datareader import data as pdr
import pandas as pd
from pymongo import MongoClient
import yfinance as yf
yf.pdr_override()
Next, we download the data from Yahoo Finance:
# download dataframe
data1 = pdr.get_data_yahoo("SPY", start="2017-01-01", end="2017-01-15")
Now, we make a connection to MongoDB.
Here I connect to a local MongoDB server on port 27017 and use a database named `finance`. MongoDB stores documents in collections (roughly comparable to tables), and I name mine `mycollection`. MongoDB creates the database and the collection lazily, on the first insert.
# Step 1: Connect to MongoDB - Note: Change connection string as needed
myclient = MongoClient("mongodb://localhost:27017/")
mydb = myclient["finance"]
mycol = mydb["mycollection"]
Now that you have the data (data1), you can insert it into the MongoDB database. Before that, you have to convert the DataFrame into a list of dictionaries, one per row. Also, the Date column is set as the index of the DataFrame, so you have to reset the index before converting; otherwise the dates are left out of the records.
# Step 2: Insert Data into DB
data1.reset_index(inplace=True) # Reset Index
data_dict = data1.to_dict("records") # Convert to dictionary
mycol.insert_one({"index":"SPY","data":data_dict}) # insert into DB as a single document
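To make the shape of the stored document concrete, here is a minimal sketch of the conversion step on its own, without a database connection. The two-row frame and its prices are made up for illustration, and `frame_to_document` is a hypothetical helper name, not part of pandas or pymongo.

```python
import pandas as pd

# Hypothetical two-row frame standing in for data1; the prices are made up.
frame = pd.DataFrame(
    {"Close": [225.24, 226.58]},
    index=pd.to_datetime(["2017-01-03", "2017-01-04"]),
)
frame.index.name = "Date"


def frame_to_document(ticker, frame):
    """Build one dictionary that holds every row of the frame under "data"."""
    # reset_index() without inplace returns a copy, so the caller's frame
    # keeps its Date index; the copy has Date as an ordinary column.
    records = frame.reset_index().to_dict("records")
    return {"index": ticker, "data": records}


document = frame_to_document("SPY", frame)
print(document["data"][0])
```

The resulting dictionary has the same layout as the one passed to `insert_one` above: the ticker under `index` and a list of row dictionaries under `data`. Using the non-inplace `reset_index()` is a design choice of this sketch so that the original frame stays usable afterwards.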
With the code above, the data is now saved in MongoDB. You can open your MongoDB UI and check the data; it appears like below:
Now, the question is how to load the data from MongoDB back into a pandas DataFrame.
We fetch the document from MongoDB using find_one(), then convert its `data` field into a DataFrame using pandas. After that, I set "Date" as the index and print the result.
# Step 3: Get data from DB
data_from_db = mycol.find_one({"index":"SPY"})
output_dataframe = pd.DataFrame(data_from_db["data"])
output_dataframe.set_index("Date",inplace=True)
print(output_dataframe)
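One detail worth handling is that `find_one()` returns `None` when no document matches the filter, in which case indexing the result fails. The sketch below rebuilds the DataFrame with that check in place. The `stored` dictionary is a made-up stand-in for what `find_one()` would return, and `document_to_frame` is a hypothetical helper name.

```python
import datetime as dt

import pandas as pd

# Hypothetical document shaped like the one saved earlier; values are made up.
stored = {
    "index": "SPY",
    "data": [
        {"Date": dt.datetime(2017, 1, 3), "Close": 225.24},
        {"Date": dt.datetime(2017, 1, 4), "Close": 226.58},
    ],
}


def document_to_frame(document):
    """Turn a stored document back into a DataFrame indexed by Date."""
    if document is None:
        # This is what a query with no match would hand us.
        raise LookupError("no document matched the query")
    return pd.DataFrame(document["data"]).set_index("Date")


restored = document_to_frame(stored)
print(restored)
```

With a live connection, the argument would be the result of `mycol.find_one({"index":"SPY"})` instead of the hand-written `stored` dictionary.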
The output looks like this: