nova-act-mcp

by madtank

Server

nova-act-mcp

An MCP server providing tools to control web browsers using the Amazon Nova Act SDK. Enables interactive browser automation with transparent agent reasoning via MCP.

Nova Act MCP Example

What is nova-act-mcp?

Nova Act MCP is a bridge between Amazon's Nova Act browser automation SDK and the Model Context Protocol (MCP). It allows AI assistants like Claude to control web browsers to perform tasks through natural language instructions, while providing visibility into the agent's reasoning process.

This project exposes Nova Act's powerful browser automation capabilities through an MCP server interface, making it easy to:

Control web browsers directly from AI assistants
Execute interactive browser automation tasks
Maintain browser sessions between interactions
See the agent's step-by-step reasoning process

Prerequisites

Before getting started, you'll need:

Python 3.10 or higher
An Amazon Nova Act API key (get one from https://nova.amazon.com/act)
Claude Desktop application (for using with Claude)

Installation

Clone this repository:

git clone https://github.com/yourusername/nova-act-mcp.git
cd nova-act-mcp

Install dependencies:
```
pip install -r requirements.txt
```

Getting Started

This guide focuses on setting up the server with the Claude Desktop application, but nova-act-mcp is designed to work with any Model Context Protocol (MCP) compatible client, such as Visual Studio Code or others. Consult your specific client's documentation for instructions on integrating MCP servers.

Nova Act MCP Claude Example

1. Obtain a Nova Act API Key

Go to https://nova.amazon.com/act
Generate a new API key
Save it for the next step

2. Configure Your MCP Client (Example: Claude Desktop)

The following steps show how to configure the Claude Desktop application. If you are using a different MCP client, adapt these instructions accordingly.

Install Claude Desktop from https://claude.ai/desktop (if using Claude).

Open your Claude for Desktop App configuration:

# MacOS
code ~/Library/Application\ Support/Claude/claude_desktop_config.json

# Windows
code %USERPROFILE%\AppData\Roaming\Claude\claude_desktop_config.json

Add the nova-act-mcp server configuration:

{
  "mcpServers": {
    "nova-browser": {
      "command": "uv",
      "args": [
        "--directory",
        "/full/path/to/nova-act-mcp",
        "run",
        "nova_mcp.py"
      ],
      "transport": "stdio",
      "env": {
        "NOVA_ACT_API_KEY": "your_api_key_here"
      }
    }
  }
}

Replace:

/full/path/to/nova-act-mcp with the absolute path to where you cloned this repository
your_api_key_here with your actual Nova Act API key

Save the file and restart Claude Desktop

3. Using with Your MCP Client (Example: Claude)

Once configured, you can use the browser automation tool with your MCP client. For Claude Desktop, look for the hammer icon (🔨), which indicates available MCP tools.

Try a simple example like this:

Can you help me search for news on Google? Use the nova-browser tool to:
1. Start a session on google.com
2. Execute the instruction "search for news"
3. Tell me what you see and your reasoning process

Understanding Agent Thinking

One of the key features of nova-act-mcp is the ability to see the agent's reasoning process. When the agent performs actions in the browser, you'll see detailed explanations of:

How it's interpreting your instructions
What elements it sees on the page
Why it's choosing specific actions
How it confirms successful completion

This transparency helps you understand how the agent is approaching tasks and makes debugging easier when something doesn't work as expected.

Tips for Effective Browser Automation

Be Specific and Concise

When prompting for browser actions:

✅ DO:

Provide clear, actionable instructions
Focus on one task at a time
Be specific about what elements to interact with

❌ DON'T:

Use vague or ambiguous language
Request complex multi-step actions in a single instruction
Ask for subjective evaluations without clear criteria

Example of Good Instructions

Instead of trying to do everything at once, break it down into interactive steps:

1. Start a session on amazon.com
2. Execute "search for bluetooth headphones"
3. Execute "filter by 4+ stars"
4. Execute "sort by price low to high"
5. Execute "find the price of the first item"

Advanced Features

Persistent Browser Sessions

The nova-act-mcp server maintains browser profiles in the profiles/ directory, allowing you to:

Maintain login sessions between uses
Keep cookies and local storage data
Resume workflows in later conversations

Each profile is isolated, so you can maintain different identities or login states.

Browser Session Management

The following actions are available for browser session management:

start - Starts a new browser session with a specified URL
execute - Executes a natural language instruction in the current session
end - Ends the current browser session and cleans up resources

Example flow:

# Start a session
start: https://www.google.com

# Execute instructions
execute: search for weather in San Francisco
execute: click on the first result

# End the session
end

Testing

You can test the nova-act-mcp server using the MCP Inspector tool:

# Install MCP Inspector if you haven't already
npm install -g @modelcontextprotocol/inspector

# Run the server with the MCP Inspector
NOVA_ACT_API_KEY="your_api_key_here" npx @modelcontextprotocol/inspector uv --directory /path/to/nova-act-mcp run nova_mcp.py

Then use the following input format to test a simple browser action:

{
  "action": "start",
  "url": "https://www.google.com"
}

Followed by:

{
  "action": "execute",
  "session_id": "your_session_id_from_previous_response",
  "instruction": "search for tea pots"
}

This lets you verify that the server is working correctly and see exactly what the agent is thinking during each step.

Troubleshooting

Nova Act API Key Issues

If you see an error about the Nova Act API key:

Verify your API key is valid at https://nova.amazon.com/act
Check that the key is correctly set in your Claude Desktop configuration
Try setting it as an environment variable: export NOVA_ACT_API_KEY="your_key_here"

Browser Automation Problems

If the browser is not behaving as expected:

Check the agent_thinking field in the response to see how the agent is interpreting your instructions
Make your instructions more specific and actionable
For tasks involving forms or logins, be explicit about field names

Debug Mode

To enable more detailed debugging information:

export NOVA_MCP_DEBUG=1

This will include additional diagnostic information in the responses.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

Amazon Nova Act for providing the browser automation SDK
Model Context Protocol (MCP) for the AI agent communication standard

Feedback for Amazon Nova Act

While we've implemented agent thinking extraction, we note that this information should ideally be directly accessible through the Nova Act API. The current approach requires parsing HTML trace files, which is less efficient than having a dedicated API method. We welcome any future improvements to the Nova Act SDK that would make this process more straightforward.

Limitations

No file uploads: The browser automation can't upload files from your local system
Limited to web interactions: Can only interact with elements visible on the webpage
Some sites may block automation: Sites with strong anti-bot measures may present challenges
Session timeouts: Long-running sessions may be terminated by websites

Related Services

playwright-mcp

Server

★ 4.8 (120)

View Details →

blender-mcp

Server

★ 4.8 (120)

View Details →

tavily-mcp

Server

★ 4.8 (120)

View Details →

nova-act-mcp

Tags

nova-act-mcp

What is nova-act-mcp?

Prerequisites

Installation

Getting Started

1. Obtain a Nova Act API Key

2. Configure Your MCP Client (Example: Claude Desktop)

3. Using with Your MCP Client (Example: Claude)

Understanding Agent Thinking

Tips for Effective Browser Automation

Be Specific and Concise

Example of Good Instructions

Advanced Features

Persistent Browser Sessions

Browser Session Management

Testing

Troubleshooting

Nova Act API Key Issues

Browser Automation Problems

Debug Mode

License

Acknowledgements

Feedback for Amazon Nova Act

Limitations

Related Services

playwright-mcp

blender-mcp

tavily-mcp