This page shows you how to use data connectors to access your data stored in Cloud Storage, Google Drive, Slack, or Jira and how to use that data with LlamaIndex on Vertex AI for RAG. The Import RagFiles API provides data connectors to these data sources.
Import files from Cloud Storage or Google Drive
To import files from Cloud Storage or Google Drive into your corpus, do the following:
- Create a corpus by following the instructions at Create a RAG corpus.
- Import your files from Cloud Storage or Google Drive by using the template.
Import files from Slack
To import files from Slack into your corpus, do the following:
- Create a corpus, which is an index that structures and optimizes your data for searching. Follow the instructions at Create a RAG corpus.
- Get the CHANNEL_ID of the Slack channel that you want to import.
- Create and set up a Slack app to use with LlamaIndex on Vertex AI for RAG.
- From the Slack UI, in the Add features and functionality section, click Permissions.
- Add the following permissions:
  - channels:history
  - groups:history
  - im:history
  - mpim:history
- Click Install to Workspace to install the app into your Slack workspace.
- Click Copy to get your API token, which authenticates your identity and grants you access to an API.
- Add your API token to your Secret Manager.
- To view the stored secret, grant the Secret Manager Secret Accessor role to your project's LlamaIndex on Vertex AI for RAG service account.
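The import request refers to the stored token by its secret version rather than by the token value itself. The following is a minimal sketch of assembling that reference, assuming the standard Secret Manager resource name format `projects/PROJECT/secrets/SECRET/versions/VERSION`. The helper name and the sample IDs are hypothetical and not part of any SDK.

```python
def secret_version_name(project: str, secret_id: str, version: str) -> str:
    """Build a Secret Manager secret version resource name (hypothetical helper)."""
    for label, value in (
        ("project", project),
        ("secret_id", secret_id),
        ("version", version),
    ):
        # Each part is a single path segment, so it can't be empty or contain '/'.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty ID without '/': {value!r}")
    return f"projects/{project}/secrets/{secret_id}/versions/{version}"


# Hypothetical project and secret IDs, for illustration only.
print(secret_version_name("my-project", "slack-api-token", "1"))
```

The resulting string is the kind of value that the samples on this page pass as API_KEY_SECRET_VERSION.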
The following curl and Python code samples demonstrate how to import files from your Slack resources.
curl
If you want to get messages from a specific channel, change the CHANNEL_ID.
# Replace the uppercase placeholder values before you run the command.
API_KEY_SECRET_VERSION=SLACK_API_KEY_SECRET_VERSION
CHANNEL_ID=SLACK_CHANNEL_ID
PROJECT_ID=YOUR_PROJECT_ID
REGION=us-central1
# Regional Vertex AI endpoint for the chosen region.
ENDPOINT=${REGION}-aiplatform.googleapis.com
RAG_CORPUS_ID=YOUR_RAG_CORPUS_ID

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${REGION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
  -d '{
    "import_rag_files_config": {
      "slack_source": {
        "channels": [
          {
            "apiKeyConfig": {
              "apiKeySecretVersion": "'"${API_KEY_SECRET_VERSION}"'"
            },
            "channels": [
              {
                "channel_id": "'"${CHANNEL_ID}"'"
              }
            ]
          }
        ]
      }
    }
  }'
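If you assemble the request programmatically instead of with shell quoting, the same body can be built as a dictionary and serialized. The following is a minimal sketch that mirrors the field names used in the curl sample. The function names and sample values are hypothetical, and nothing is sent over the network.

```python
import json


def build_slack_import_request(api_key_secret_version: str, channel_ids: list[str]) -> dict:
    """Return a ragFiles:import request body for a Slack source (hypothetical helper)."""
    if not channel_ids:
        raise ValueError("at least one channel ID is required")
    return {
        "import_rag_files_config": {
            "slack_source": {
                "channels": [
                    {
                        # All channels in this group share one API token.
                        "apiKeyConfig": {"apiKeySecretVersion": api_key_secret_version},
                        "channels": [{"channel_id": cid} for cid in channel_ids],
                    }
                ]
            }
        }
    }


def import_url(endpoint: str, project_id: str, region: str, rag_corpus_id: str) -> str:
    """Return the ragFiles:import URL, keeping project and region as separate values."""
    return (
        f"https://{endpoint}/v1beta1/projects/{project_id}"
        f"/locations/{region}/ragCorpora/{rag_corpus_id}/ragFiles:import"
    )


# Hypothetical values, for illustration only.
body = build_slack_import_request(
    "projects/my-project/secrets/slack-api-token/versions/1", ["C0123456789"]
)
print(import_url("us-central1-aiplatform.googleapis.com", "my-project", "us-central1", "my-corpus-1"))
print(json.dumps(body, indent=2))
```

Building the body as a dictionary avoids the nested quote escaping that the shell version needs around each variable.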
Python
If you want to get messages for a given range of time or from a specific channel, change any of the following fields:
- START_TIME
- END_TIME
- CHANNEL1 or CHANNEL2
# Slack example
from google.protobuf import timestamp_pb2
from vertexai.preview import rag

# START_TIME and END_TIME bound the messages to import. This sample sets both
# to the current time; set them to the range that you need.
START_TIME = timestamp_pb2.Timestamp()
START_TIME.GetCurrentTime()
END_TIME = timestamp_pb2.Timestamp()
END_TIME.GetCurrentTime()

source = rag.SlackChannelsSource(
    channels=[
        rag.SlackChannel("CHANNEL1", "api_key1"),
        rag.SlackChannel("CHANNEL2", "api_key2", START_TIME, END_TIME),
    ],
)

response = rag.import_files(
    corpus_name="projects/my-project/locations/us-central1/ragCorpora/my-corpus-1",
    source=source,
    chunk_size=512,
    chunk_overlap=100,
)
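The sample above sets both timestamps to the current time, so in practice you would substitute a real range. The following is a minimal sketch of computing a "last N days" window with the standard library. The helper name is hypothetical, and the commented lines assume the protobuf `Timestamp.FromDatetime` method for converting the result.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def last_n_days(days: int, now: Optional[datetime] = None) -> Tuple[datetime, datetime]:
    """Return (start, end) as timezone-aware UTC datetimes covering the last `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return start, end


start_dt, end_dt = last_n_days(7)
print(start_dt.isoformat(), end_dt.isoformat())

# With protobuf installed, the datetimes could populate the Timestamp objects:
#   ts = timestamp_pb2.Timestamp()
#   ts.FromDatetime(start_dt)
```

Using timezone-aware UTC datetimes keeps the range unambiguous regardless of the machine's local time zone.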
Import files from Jira
To import files from Jira into your corpus, do the following:
- Create a corpus, which is an index that structures and optimizes your data for searching. Follow the instructions at Create a RAG corpus.
- To create an API token, sign in to the Atlassian site.
- Use {YOUR_ORG_ID}.atlassian.net as the SERVER_URI in the request.
- Use your Atlassian email as the EMAIL in the request.
- Provide projects or customQueries with your request. To learn more about custom queries, see Use advanced search with Jira Query Language (JQL).
  When you import projects, projects is expanded into the corresponding queries to get the entire project. For example, MyProject is expanded to project = MyProject.
- Click Copy to get your API token, which authenticates your identity and grants you access to an API.
- Add your API token to your Secret Manager.
- Grant the Secret Manager Secret Accessor role to your project's LlamaIndex on Vertex AI for RAG service account.
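The project expansion described in the steps above can be illustrated with a short sketch. It reproduces only the documented rule that MyProject becomes project = MyProject. The function names are hypothetical, and treating the combined list as the set of queries that run is an assumption rather than documented service behavior.

```python
def expand_projects(projects: list[str]) -> list[str]:
    """Expand each Jira project name into a JQL query that selects the whole project."""
    return [f"project = {name}" for name in projects]


def effective_queries(projects: list[str], custom_queries: list[str]) -> list[str]:
    """Combine expanded project queries with custom JQL queries (assumed behavior)."""
    return expand_projects(projects) + list(custom_queries)


# Hypothetical project names and query, for illustration only.
print(expand_projects(["MyProject"]))
print(effective_queries(["MyProject"], ["assignee = currentUser()"]))
```

If you need anything narrower than an entire project, a custom query is the place to express it.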
curl
# Replace the uppercase placeholder values before you run the command.
EMAIL=JIRA_EMAIL
API_KEY_SECRET_VERSION=JIRA_API_KEY_SECRET_VERSION
SERVER_URI=JIRA_SERVER_URI
CUSTOM_QUERY=JIRA_CUSTOM_QUERY
# The Jira project to import, kept separate from the Google Cloud project ID.
JIRA_PROJECT_KEY=JIRA_PROJECT
PROJECT_ID=YOUR_PROJECT_ID
REGION=us-central1
# Regional Vertex AI endpoint for the chosen region.
ENDPOINT=${REGION}-aiplatform.googleapis.com
RAG_CORPUS_ID=YOUR_RAG_CORPUS_ID

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${ENDPOINT}/v1beta1/projects/${PROJECT_ID}/locations/${REGION}/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
  -d '{
    "import_rag_files_config": {
      "jiraSource": {
        "jiraQueries": [{
          "projects": ["'"${JIRA_PROJECT_KEY}"'"],
          "customQueries": ["'"${CUSTOM_QUERY}"'"],
          "email": "'"${EMAIL}"'",
          "serverUri": "'"${SERVER_URI}"'",
          "apiKeyConfig": {
            "apiKeySecretVersion": "'"${API_KEY_SECRET_VERSION}"'"
          }
        }]
      }
    }
  }'
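As with Slack, the Jira request body can be built as a dictionary instead of a quoted shell string. The following is a minimal sketch that mirrors the field names in the curl sample and enforces the requirement to provide projects or customQueries. The function name and sample values are hypothetical, and nothing is sent over the network.

```python
import json
from typing import Sequence


def build_jira_import_request(
    email: str,
    server_uri: str,
    api_key_secret_version: str,
    projects: Sequence[str] = (),
    custom_queries: Sequence[str] = (),
) -> dict:
    """Return a ragFiles:import request body for a Jira source (hypothetical helper)."""
    if not projects and not custom_queries:
        raise ValueError("provide at least one project or custom query")
    query = {
        "email": email,
        "serverUri": server_uri,
        "apiKeyConfig": {"apiKeySecretVersion": api_key_secret_version},
    }
    # Include only the selectors that were actually supplied.
    if projects:
        query["projects"] = list(projects)
    if custom_queries:
        query["customQueries"] = list(custom_queries)
    return {"import_rag_files_config": {"jiraSource": {"jiraQueries": [query]}}}


# Hypothetical values, for illustration only.
body = build_jira_import_request(
    email="user@example.com",
    server_uri="example.atlassian.net",
    api_key_secret_version="projects/my-project/secrets/jira-api-token/versions/1",
    projects=["MyProject"],
)
print(json.dumps(body, indent=2))
```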
Python
# Jira example
from vertexai.preview import rag

jira_query = rag.JiraQuery(
    email="xxx@yyy.com",
    jira_projects=["project1", "project2"],
    custom_queries=["query1", "query2"],
    api_key="api_key",
    server_uri="server.atlassian.net",
)

source = rag.JiraSource(
    queries=[jira_query],
)

# Replace REGION with the region of your corpus, for example us-central1.
response = rag.import_files(
    corpus_name="projects/my-project/locations/REGION/ragCorpora/my-corpus-1",
    source=source,
    chunk_size=512,
    chunk_overlap=100,
)
What's next
- To learn more about grounding, see Grounding overview.
- To learn more about LlamaIndex on Vertex AI for RAG, see Use LlamaIndex on Vertex AI for RAG.
- To learn more about grounding and RAG, see Ground responses using RAG.