    sinaptik-ai/pandas-ai

    Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

    ai
    analytics
    database
    llm
    csv
    data
    data-analysis
    data-science
    data-visualization
    datalake
    gpt-4
    pandas
    sql
    text-to-sql
    Python
    NOASSERTION
    22.8K stars
    2.2K forks
    22.8K watching
    Updated 2/27/2026

    Health Score: 75
    Weekly Growth: +0 (+0.0% this week)
    Total contributors: 1
    Open Issues: 14

    Generated Insights

    About pandas-ai

    PandasAI

    PandasAI is a Python platform that makes it easy to ask questions of your data in natural language. It helps non-technical users interact with their data more naturally, and it helps technical users save time and effort when working with data.

    🔧 Getting started

    You can find the full documentation for PandasAI here.

    You can use PandasAI in Jupyter notebooks or Streamlit apps, or use the client and server architecture from the repo.

    📚 Using the library

    Python Requirements

    Python version 3.8+ <3.12
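
The supported range can be checked at runtime before installing the library. This is a minimal sketch using only the standard library; the helper name `is_supported_python` is hypothetical and not part of PandasAI.

```python
import sys

# Supported range stated above: 3.8 or later, but below 3.12.
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) pair falls within the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    if is_supported_python():
        print("This interpreter is within the supported range.")
    else:
        print("This interpreter is outside the supported range (3.8 <= version < 3.12).")
```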

    📦 Installation

    You can install the PandasAI library using pip or poetry.

    With pip:

    pip install "pandasai>=3.0.0b2"
    

    With poetry:

    poetry add "pandasai>=3.0.0b2"
    

    💻 Usage

    Ask questions

    import pandasai as pai
    from pandasai_openai.openai import OpenAI
    
    llm = OpenAI("OPEN_AI_API_KEY")
    
    pai.config.set({
        "llm": llm
    })
    
    # Sample DataFrame
    df = pai.DataFrame({
        "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
        "revenue": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
    })
    
    df.chat('Which are the top 5 countries by sales?')
    
    China, United States, Japan, Germany, United Kingdom
    

    Or you can ask more complex questions:

    df.chat(
        "What is the total sales for the top 3 countries by sales?"
    )
    
    The total sales for the top 3 countries by sales is 16500.
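
Because the answers come from an LLM, it can be worth cross-checking them against a direct computation. The sketch below recomputes both results from the same sample data in plain Python. The helper `top_countries` is a hypothetical name introduced for illustration; it is not part of PandasAI and is not the code PandasAI generates.

```python
# Same sample data as the DataFrame above, kept as plain Python lists.
countries = ["United States", "United Kingdom", "France", "Germany", "Italy",
             "Spain", "Canada", "Australia", "Japan", "China"]
revenue = [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]


def top_countries(n):
    """Return the n (country, revenue) pairs with the highest revenue, descending."""
    ranked = sorted(zip(countries, revenue), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


top_5_names = [name for name, _ in top_countries(5)]
top_3_total = sum(value for _, value in top_countries(3))

print(top_5_names)  # ['China', 'United States', 'Japan', 'Germany', 'United Kingdom']
print(top_3_total)  # 16500
```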
    

    Visualize charts

    You can also ask PandasAI to generate charts for you:

    df.chat(
        "Plot the histogram of countries showing for each one the revenue. Use different colors for each bar",
    )
    

    Multiple DataFrames

    You can also pass multiple DataFrames to PandasAI and ask questions that relate them.

    import pandasai as pai
    from pandasai_openai.openai import OpenAI
    
    employees_data = {
        'EmployeeID': [1, 2, 3, 4, 5],
        'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
        'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
    }
    
    salaries_data = {
        'EmployeeID': [1, 2, 3, 4, 5],
        'Salary': [5000, 6000, 4500, 7000, 5500]
    }
    
    llm = OpenAI("OPEN_AI_API_KEY")
    
    pai.config.set({
        "llm": llm
    })
    
    employees_df = pai.DataFrame(employees_data)
    salaries_df = pai.DataFrame(salaries_data)
    
    
    pai.chat("Who gets paid the most?", employees_df, salaries_df)
    
    Olivia gets paid the most.
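
Answering this question means relating the two tables through their shared `EmployeeID` column. The following sketch shows the equivalent lookup in plain Python, which can serve as a sanity check on the answer. It illustrates the underlying join only and is not the code PandasAI actually generates.

```python
employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

# Build lookups keyed by EmployeeID, the column both tables share.
salary_by_id = dict(zip(salaries_data['EmployeeID'], salaries_data['Salary']))
name_by_id = dict(zip(employees_data['EmployeeID'], employees_data['Name']))

# Find the ID with the highest salary, then resolve it to a name.
top_id = max(salary_by_id, key=salary_by_id.get)
top_name = name_by_id[top_id]

print(top_name, salary_by_id[top_id])  # Olivia 7000
```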
    

    Docker Sandbox

    You can run PandasAI in a Docker sandbox, providing a secure, isolated environment to execute code safely and mitigate the risk of malicious attacks.

    Python Requirements
    pip install "pandasai-docker"
    
    Usage
    import pandasai as pai
    from pandasai_docker import DockerSandbox
    from pandasai_openai.openai import OpenAI
    
    # Initialize the sandbox
    sandbox = DockerSandbox()
    sandbox.start()
    
    employees_data = {
        'EmployeeID': [1, 2, 3, 4, 5],
        'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
        'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
    }
    
    salaries_data = {
        'EmployeeID': [1, 2, 3, 4, 5],
        'Salary': [5000, 6000, 4500, 7000, 5500]
    }
    
    llm = OpenAI("OPEN_AI_API_KEY")
    
    pai.config.set({
        "llm": llm
    })
    
    employees_df = pai.DataFrame(employees_data)
    salaries_df = pai.DataFrame(salaries_data)
    
    pai.chat("Who gets paid the most?", employees_df, salaries_df, sandbox=sandbox)
    
    # Don't forget to stop the sandbox when done
    sandbox.stop()
    
    Olivia gets paid the most.
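
If the chat call raises an exception, the explicit `sandbox.stop()` above is never reached. One way to guard against that is a small context manager that always stops the sandbox. This is a sketch under the assumption that the sandbox exposes only the `start()` and `stop()` methods used above. The names `running` and `FakeSandbox` are hypothetical and not part of PandasAI; `FakeSandbox` exists only so the sketch runs without Docker.

```python
from contextlib import contextmanager


@contextmanager
def running(sandbox):
    """Start the sandbox, yield it, and always stop it afterwards."""
    sandbox.start()
    try:
        yield sandbox
    finally:
        sandbox.stop()


class FakeSandbox:
    """Hypothetical stand-in that records calls instead of using Docker."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


fake = FakeSandbox()
try:
    with running(fake) as sandbox:
        # Simulate a failure in the middle of the work.
        raise RuntimeError("simulated failure during chat")
except RuntimeError:
    pass

print(fake.events)  # ['start', 'stop']
```

With the real sandbox, the same pattern would wrap the call as `with running(DockerSandbox()) as sandbox:` followed by `pai.chat(..., sandbox=sandbox)` inside the block.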
    

    You can find more examples in the examples directory.

    📜 License

    PandasAI is available under the MIT expat license, except for the pandasai/ee directory of this repository, which has its license here.

    If you are interested in the managed PandasAI Cloud or the self-hosted Enterprise offering, contact us.

    Resources

    Beta Notice
    Release v3 is currently in beta. The following documentation and examples reflect the features and functionality in progress and may change before the final release.

    • Docs for comprehensive documentation
    • Examples for example notebooks
    • Discord for discussion with the community and PandasAI team

    ๐Ÿค Contributing

    Contributions are welcome! Please check the outstanding issues and feel free to open a pull request. For more information, please check out the contributing guidelines.

    Thank you!
