Voice call capture options

Capturing data from end users in Voice

In many Voice conversations, it’s important to collect information from end users—whether that’s a booking reference number to update a hotel reservation or details about when an order was received to provide accurate return eligibility guidance. Capturing the right data at the right time ensures the AI Agent can provide effective and relevant support.

To support this, Ada Voice offers multiple capture options, which are the ways an AI Agent can collect information during a call. End users can often answer by speaking naturally, but some types of data are better collected through other channels. Depending on what you’re capturing, you can also let end users respond via SMS or by using their phone’s dial pad (dual-tone multi-frequency, or DTMF, tones).

These options are especially useful when collecting complex or error-prone information. For example, if you’re asking for a 20-digit order number, letting end users enter it using their dial pad can improve accuracy. Or if you’re requesting something like an email or home address—data that’s tricky to transcribe correctly over the phone—you can collect it via SMS instead.

By choosing the right capture mode for each scenario, you set both your AI Agent and your end users up for success—making conversations smoother, more accurate, and more efficient. The sections below walk through the available options to help you configure the best setup for your Voice experience.

Capture options for Action inputs

Actions enable your AI Agent to integrate with systems outside Ada using API calls. Often, you will need to capture Inputs from end users to use in your Actions. For example, if you have an Action that looks up an end user’s order status, you may need to collect an order number from the end user.

If you’re using your AI Agent in Voice, you can also define a Voice call capture option for each of these inputs. A capture option defines how an input is gathered, and selecting the right one helps data collection feel natural, efficient, and appropriate for the type of information being requested. You can choose from the following capture options for Action inputs:

  • Speech: The AI Agent asks the end user for an input, and they respond by speaking. This is the most conversational and commonly used option and is the default option for all Action inputs.
  • Speech and DTMF (Dial Pad Input): The AI Agent asks the end user for an input, and they respond by speaking or by entering the input using their phone’s dial pad. This is ideal for structured numeric inputs like order numbers, dates, or codes.
  • Speech and SMS: The AI Agent asks the end user whether they want to provide an input by speaking it or by responding to a text message. If the end user chooses speech, they simply say the input. If they choose a text message, the AI Agent sends a prompt by text message to the end user’s mobile device, and the end user replies to it. This is especially effective for collecting detailed or sensitive information, such as an email or mailing address.
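
As a rough illustration of how you might reason about this choice, the Python sketch below suggests a capture option from one representative sample of an input. It is not part of Ada's product or API: the CaptureOption enum, the suggest_capture_option function, and the hard_to_transcribe flag are hypothetical names invented for this example, and the rules are a simplification of the guidance above. In practice, you choose the capture option for each Action input yourself.

```python
import re
from enum import Enum


class CaptureOption(Enum):
    """The three capture options described above (hypothetical representation)."""
    SPEECH = "Speech"
    SPEECH_AND_DTMF = "Speech and DTMF"
    SPEECH_AND_SMS = "Speech and SMS"


def suggest_capture_option(sample: str, hard_to_transcribe: bool = False) -> CaptureOption:
    """Suggest a capture option from one representative sample value.

    hard_to_transcribe is an assumed, manually set flag for inputs such as
    mailing addresses that are difficult to capture accurately by voice.
    """
    value = sample.strip()

    # Email addresses and other hard-to-transcribe inputs: offer SMS.
    if hard_to_transcribe or "@" in value:
        return CaptureOption.SPEECH_AND_SMS

    # Structured numeric inputs (order numbers, dates, codes): offer the dial pad.
    # Common separators are ignored so that "2024-05-01" counts as numeric.
    compact = re.sub(r"[\s\-/]", "", value)
    if re.fullmatch(r"[0-9]+", compact):
        return CaptureOption.SPEECH_AND_DTMF

    # Everything else: the default, conversational option.
    return CaptureOption.SPEECH
```

For example, under these assumed rules a 20-digit order number maps to Speech and DTMF, an email address maps to Speech and SMS, and a first name falls back to Speech.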

By offering flexible capture options, you make it easier for end users to provide the information your AI Agent needs—leading to more successful and seamless Voice experiences.

Your AI Agent will always get the end user’s consent before sending SMS messages. For more information, see SMS consent.