Tutorial to Create a Data Science Agent: A Code Implementation Using the gemini-2.0-flash-lite Model Through the Google API, google.generativeai, Pandas, and IPython.display for Interactive Data Analysis


In this tutorial, we demonstrate the integration of Python's robust data manipulation library Pandas with Google Cloud's advanced generative capabilities through the google.generativeai package and the gemini-2.0-flash-lite model. By setting up the environment with the necessary libraries, configuring the Google API key, and leveraging the IPython display functionalities, the code provides a step-by-step approach to building a data science agent that analyzes a sample sales dataset. The example shows how to convert a DataFrame into markdown format and then use natural language queries to generate insights about the data, highlighting the potential of combining traditional data analysis tools with modern AI-driven methods.

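!pip install pandas google-generativeai tabulate --quiet

First, we install the Pandas and google-generativeai libraries quietly, setting up the environment for data manipulation and AI-powered analysis. The tabulate package is included because DataFrame.to_markdown, which is used later to build the prompt, depends on it.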

import pandas as pd
import google.generativeai as genai
from IPython.display import Markdown

We import Pandas for data manipulation, google.generativeai for accessing Google's generative AI capabilities, and Markdown from IPython.display to render markdown-formatted outputs.
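Model replies often contain markdown such as tables or bullet lists, which print shows as raw text. As a minimal sketch of how the imported Markdown class could be put to use, the hypothetical helper below (show_markdown is not part of the tutorial's code) renders text in a notebook and falls back to plain printing when IPython is unavailable.

```python
def show_markdown(text):
    """Render markdown in a notebook, or print it when IPython is missing.

    show_markdown is a hypothetical helper introduced for illustration.
    Returns a short label naming the path that was taken.
    """
    try:
        from IPython.display import Markdown, display
    except ImportError:
        # IPython is not installed, so plain text is the best available output.
        print(text)
        return "print"
    display(Markdown(text))
    return "ipython"


# A made-up reply standing in for text returned by the model.
sample_reply = "| Product | Units Sold |\n|---|---|\n| Laptop | 150 |"
mode = show_markdown(sample_reply)
```

In Colab or Jupyter the table is rendered as formatted output; in a plain terminal the call still succeeds but shows only a textual representation.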

GOOGLE_API_KEY = "Use Your API Key Here"
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-2.0-flash-lite')

We assign a placeholder API key, configure the google.generativeai client with it, and initialize the 'gemini-2.0-flash-lite' GenerativeModel for generating content.
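Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. One common alternative, sketched here under assumptions, is to read the key from an environment variable. The helper name load_api_key and the variable name GOOGLE_API_KEY are illustrative choices, not something the original code defines.

```python
import os


def load_api_key(env_var="GOOGLE_API_KEY"):
    """Return the API key stored in the given environment variable.

    Both this helper and the default variable name are assumptions
    made for illustration.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With this in place, the configuration step could read genai.configure(api_key=load_api_key()) instead of passing a literal string.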

data = {
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics'],
    'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
    'Units Sold': [150, 200, 180, 120, 90, 250],
    'Price': [1200, 25, 75, 300, 50, 100]
}
sales_df = pd.DataFrame(data)

print("Sample Sales Data:")
print(sales_df)
print("-" * 30)

Here, we create a Pandas DataFrame named sales_df containing sample sales data for various products, and then print the DataFrame followed by a separator line to visually separate the output.

def ask_gemini_about_data(dataframe, query):
    """
    Asks the Gemini model a question about the given Pandas DataFrame.

    Args:
        dataframe: The Pandas DataFrame to analyze.
        query: The natural language question about the DataFrame.

    Returns:
        The response from the Gemini model as a string.
    """
    prompt = f"""You are a data analysis agent. Analyze the following pandas DataFrame and answer the question.

DataFrame:
```
{dataframe.to_markdown(index=False)}
```

Question: {query}
Answer:
"""
    response = model.generate_content(prompt)
    return response.text

Here, we construct a markdown-formatted prompt from a Pandas DataFrame and a natural language query, then use the Gemini model to generate and return an analytical response.
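It can help to inspect the prompt before any API call is made. The sketch below separates prompt construction from the model call so the text can be printed and checked offline. The helper names table_as_text and build_prompt are hypothetical, the CSV fallback is an assumption added for environments where tabulate is not installed, and the code fence around the table is omitted for brevity.

```python
import pandas as pd


def table_as_text(dataframe):
    """Return the DataFrame as markdown, or as CSV if tabulate is missing."""
    try:
        return dataframe.to_markdown(index=False)
    except ImportError:
        # pandas delegates to_markdown to the optional 'tabulate' package.
        return dataframe.to_csv(index=False)


def build_prompt(dataframe, query):
    """Assemble the prompt text without contacting the model."""
    return (
        "You are a data analysis agent. Analyze the following pandas "
        "DataFrame and answer the question.\n\n"
        f"DataFrame:\n{table_as_text(dataframe)}\n\n"
        f"Question: {query}\nAnswer:"
    )


# A two-row stand-in for the sales data, used only to show the prompt shape.
demo_df = pd.DataFrame({"Product": ["Laptop", "Mouse"], "Units Sold": [150, 200]})
demo_prompt = build_prompt(demo_df, "Which product sold more units?")
print(demo_prompt)
```

Keeping the prompt in its own function also makes it straightforward to log what was sent to the model alongside what came back.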

# Query 1: What is the total number of units sold across all products?
query1 = "What is the total number of units sold across all products?"
response1 = ask_gemini_about_data(sales_df, query1)
print(f"Question 1: {query1}")
print(f"Answer 1:\n{response1}")
print("-" * 30)

Query 1 Output

# Query 2: Which product had the highest number of units sold?
query2 = "Which product had the highest number of units sold?"
response2 = ask_gemini_about_data(sales_df, query2)
print(f"Question 2: {query2}")
print(f"Answer 2:\n{response2}")
print("-" * 30)

Query 2 Output

# Query 3: What is the average price of the products?
query3 = "What is the average price of the products?"
response3 = ask_gemini_about_data(sales_df, query3)
print(f"Question 3: {query3}")
print(f"Answer 3:\n{response3}")
print("-" * 30)

Query 3 Output

# Query 4: Show me the products sold in the 'North' region.
query4 = "Show me the products sold in the 'North' region."
response4 = ask_gemini_about_data(sales_df, query4)
print(f"Question 4: {query4}")
print(f"Answer 4:\n{response4}")
print("-" * 30)

Query 4 Output

# Query 5. More complex query: Calculate the total revenue for each product.
query5 = "Calculate the total revenue (Units Sold * Price) for each product and present it in a table."
response5 = ask_gemini_about_data(sales_df, query5)
print(f"Question 5: {query5}")
print(f"Answer 5:\n{response5}")
print("-" * 30)

Query 5 Output
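Language models can make arithmetic mistakes, so it is worth cross-checking the model's replies against values computed directly in Pandas. The following sketch rebuilds the same sample data and computes a reference answer for each of the five queries. The variable names are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same sample data as in the tutorial, rebuilt so the check runs on its own.
sales_df = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam", "Headphones"],
    "Category": ["Electronics"] * 6,
    "Region": ["North", "South", "East", "West", "North", "South"],
    "Units Sold": [150, 200, 180, 120, 90, 250],
    "Price": [1200, 25, 75, 300, 50, 100],
})

# Query 1: total units sold across all products.
total_units = int(sales_df["Units Sold"].sum())

# Query 2: product with the highest number of units sold.
top_product = sales_df.loc[sales_df["Units Sold"].idxmax(), "Product"]

# Query 3: average price of the products.
average_price = float(sales_df["Price"].mean())

# Query 4: products sold in the 'North' region.
north_products = sales_df.loc[sales_df["Region"] == "North", "Product"].tolist()

# Query 5: total revenue (Units Sold * Price) for each product.
revenue_df = sales_df.assign(Revenue=sales_df["Units Sold"] * sales_df["Price"])
revenue_df = revenue_df[["Product", "Revenue"]]

print(total_units, top_product, round(average_price, 2), north_products)
print(revenue_df)
```

Comparing these reference values with the text returned by ask_gemini_about_data gives a quick indication of whether a given reply can be trusted.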

In conclusion, the tutorial illustrates how the synergy between Pandas, the google.generativeai package, and the Gemini model can transform data analysis tasks into a more interactive and insightful process. The approach simplifies querying and interpreting data and opens up avenues for advanced use cases such as data cleaning, feature engineering, and exploratory data analysis. By harnessing these state-of-the-art tools within the familiar Python ecosystem, data scientists can enhance their productivity and innovation, making it easier to derive meaningful insights from complex datasets.



Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
