Use ai.summarize with pandas

The ai.summarize function uses generative AI to produce summaries of input text with a single line of code. The function can summarize the values in one column of a DataFrame or, for each row, the values across all columns.

Overview

The ai.summarize function extends the pandas Series class. Call the function on a text column of a pandas DataFrame to summarize each row's value in that column. You can also call the function on an entire DataFrame to summarize each row's values across all columns.

The function returns a pandas Series that contains summaries, which can be stored in a new DataFrame column.
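As a rough illustration of how a namespace can be attached to the pandas Series class, the following sketch uses the pandas accessor registration API. The accessor name demo, the DemoAccessor class, and the first-sentence logic are all hypothetical stand-ins. The sketch doesn't call an AI model, and it isn't a description of how ai.summarize is implemented.

```python
import pandas as pd


# Hypothetical namespace "demo" (not the real "ai" accessor).
@pd.api.extensions.register_series_accessor("demo")
class DemoAccessor:
    def __init__(self, series: pd.Series) -> None:
        self._series = series

    def summarize(self) -> pd.Series:
        # Stand-in logic: keep only the first sentence. No AI model is called.
        def first_sentence(text: str) -> str:
            return text.strip().split(". ")[0].rstrip(".") + "."

        # map() returns a new Series aligned to the original index.
        return self._series.map(first_sentence)


df = pd.DataFrame({
    "text": [
        "Pandas is a data library. It is widely used.",
        "Accessors add namespaces. They keep the API tidy.",
    ]
})

# Same call shape as df["text"].ai.summarize(): a Series in, a Series out.
df["summaries"] = df["text"].demo.summarize()
print(df["summaries"].tolist())
# ['Pandas is a data library.', 'Accessors add namespaces.']
```

Because the result is a Series with the same index as the input column, it can be assigned directly to a new DataFrame column.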

Syntax

df["summaries"] = df["text"].ai.summarize()

Parameters

Name | Description
instructions | Optional. A string that provides more context for the AI model, such as output length, tone, audience, or focus. More precise instructions produce better results.

Returns

The function returns a pandas Series that contains a summary for each input row. If the input text is null, the result is null.
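Because null inputs produce null results, a quick check after summarizing can confirm that every null summary corresponds to a null input. The following sketch uses only pandas. The values in the summaries column are hand-written placeholders that stand in for real ai.summarize output.

```python
import pandas as pd

df = pd.DataFrame({
    "description": ["A long product description.", None, "Another description."],
    # Placeholder values standing in for real ai.summarize output.
    "summaries": ["Short summary.", None, "Another summary."],
})

# Rows with a null input are expected to have a null summary.
null_inputs = df["description"].isna()
null_summaries = df["summaries"].isna()

# Rows where the summary is null even though the input was not.
unexpected = df[null_summaries & ~null_inputs]

print(f"{null_inputs.sum()} null input(s), {len(unexpected)} unexpected null summaries")
# 1 null input(s), 0 unexpected null summaries
```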

Example

# This code uses AI. Always review output for mistakes.

import pandas as pd

df = pd.DataFrame([
        ("Microsoft Teams", "2017",
        """
        The ultimate messaging app for your organization—a workspace for real-time 
        collaboration and communication, meetings, file and app sharing, and even the 
        occasional emoji! All in one place, all in the open, all accessible to everyone.
        """),
        ("Microsoft Fabric", "2023",
        """
        An enterprise-ready, end-to-end analytics platform that unifies data movement, 
        data processing, ingestion, transformation, and report building into a seamless, 
        user-friendly SaaS experience. Transform raw data into actionable insights.
        """)
    ], columns=["product", "release_year", "description"])

df["summaries"] = df["description"].ai.summarize()
display(df)

This example code cell provides the following output:

Screenshot showing a data frame. The 'summaries' column has a summary of the 'description' column only, in the corresponding row.
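To summarize values across all the columns instead of one, call the function on the DataFrame itself. The following sketch uses shortened, illustrative descriptions. The hasattr guard is an addition for this sketch only: it lets the cell run in environments where the ai accessor isn't available, in which case no summaries column is added.

```python
# This code uses AI. Always review output for mistakes.
import pandas as pd

df = pd.DataFrame(
    [
        ("Microsoft Teams", "2017",
         "A workspace for real-time collaboration, meetings, and file sharing."),
        ("Microsoft Fabric", "2023",
         "An end-to-end analytics platform delivered as a SaaS experience."),
    ],
    columns=["product", "release_year", "description"],
)

# The ai accessor exists only where AI functions are set up.
if hasattr(df, "ai"):
    # Called on the whole DataFrame, so each summary draws on every column in the row.
    df["summaries"] = df.ai.summarize()

print(df.columns.tolist())
```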

Customize summaries with instructions

Use the instructions parameter to control the tone, length, audience, or focus of generated summaries without changing the source text.

# This code uses AI. Always review output for mistakes.

df["executive_summary"] = df["description"].ai.summarize(
    instructions="Write one concise sentence for a business executive. Focus on product value and avoid marketing language."
)
display(df)
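When several summaries need consistent wording, it can help to build the instructions string from explicit choices. The following sketch assumes a hypothetical helper named build_instructions. The helper isn't part of AI functions, and its sentence template is only one possible phrasing.

```python
def build_instructions(audience: str, max_sentences: int, focus: str) -> str:
    """Hypothetical helper: compose one instructions string from explicit choices."""
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    unit = "sentence" if max_sentences == 1 else "sentences"
    return (
        f"Write at most {max_sentences} {unit} for {audience}. "
        f"Focus on {focus}."
    )


instructions = build_instructions("a business executive", 1, "product value")
print(instructions)
# Write at most 1 sentence for a business executive. Focus on product value.

# Pass the result to the function:
# df["executive_summary"] = df["description"].ai.summarize(instructions=instructions)
```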

Multimodal input

The ai.summarize function supports file-based multimodal input. This capability is part of multimodal AI Functions, which process images, PDFs, and text files alongside text data. You can summarize file content by setting column_type="path" when your column contains file path strings. For more information about supported file types and setup, see Use multimodal input with AI Functions.

# This code uses AI. Always review output for mistakes.

custom_df["summary"] = custom_df["file_path"].ai.summarize(
    instructions="Summarize this file in one sentence for a support analyst.",
    column_type="path",
)
display(custom_df)
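The custom_df DataFrame in the previous example needs a column of file path strings. The following sketch shows one way to build such a column with pathlib. The build_path_frame helper, the glob patterns, and the throwaway files are all hypothetical. The path format and file types that AI functions accept are described in the multimodal article, so treat the patterns here as placeholders.

```python
import tempfile
from pathlib import Path

import pandas as pd


def build_path_frame(folder: str, patterns=("*.pdf", "*.txt", "*.png")) -> pd.DataFrame:
    """Hypothetical helper: one row per matching file, with the path stored as a string."""
    root = Path(folder)
    paths = sorted(str(p) for pattern in patterns for p in root.glob(pattern))
    return pd.DataFrame({"file_path": paths})


# Demo with throwaway files. In practice, point at the folder that holds your files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("notes.txt", "report.pdf", "ignore.csv"):
        (Path(tmp) / name).write_text("placeholder")
    custom_df = build_path_frame(tmp)

names = custom_df["file_path"].map(lambda p: Path(p).name).tolist()
print(names)
# ['notes.txt', 'report.pdf']
```

The resulting file_path column holds plain strings, which is the shape that column_type="path" expects according to the description above.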