admin posted on 2024-8-28 14:43:54

Looking for pointers on AI model training! Expert help wanted.

I want to create an AI knowledge base for a company's internal training system. The database contains about 500 documents, tables, and images. I have followed online tutorials and figured out some things myself. Unfortunately, I ran into some issues deploying Dify. I deployed it on a small Alibaba Cloud (Aliyun) VPS in Singapore (2 cores, 4 GB RAM).
Part of the knowledge base is an Excel table with around 3 million rows, containing thousands of related pieces of information. How can I upload it row by row and get every row indexed successfully? I've failed several times so far.
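To make the row-by-row question concrete, here is a minimal sketch of one way the rows could be prepared, assuming the table is first exported to plain rows of strings. The column names and sample rows are made up for illustration, and the upload step is left out because it depends on the tool being used.

```python
from itertools import islice
from typing import Iterable, Iterator


def row_to_text(header: list[str], row: list[str]) -> str:
    """Turn one table row into a single 'column: value' string, skipping empty cells."""
    return "; ".join(f"{col}: {val}" for col, val in zip(header, row) if val != "")


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield lists of at most `size` items so no single upload is too large."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Hypothetical sample data; a real export would be streamed from a file instead.
header = ["product", "price", "note"]
rows = [["A", "10", ""], ["B", "20", "fragile"], ["C", "30", "new"]]

texts = [row_to_text(header, r) for r in rows]
batches = list(batched(texts, 2))  # each batch would be one upload/index request
```

Small batches also make it easier to retry only the batch that failed rather than the whole table.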
There are also a few books I'd like to add to the index. However, I'm currently using Baidu's Embedding-V1, which frequently gets stuck and fails to index correctly. It can only handle small documents. With other knowledge bases, most of the information gets indexed, but the failure rate is still high. Is it necessary to insert each piece of knowledge into the knowledge base individually?
My main knowledge base is now mostly complete. The chatbot mode uses GPT-4o. The answers are somewhat rough, and many pieces of information cannot be retrieved from the knowledge base. I haven't put it into use yet.
In addition, there are vector retrieval settings such as rerank models, TopK, and others, which I don't understand how to set up or train. Can you give me some guidance and answer questions in a simple and understandable way? If possible, I am willing to pay tuition for long-term study. I would like to learn from someone who can become my mentor.
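For anyone comparing notes, here is a toy sketch of how TopK and rerank are usually described: TopK keeps the k chunks whose vectors are most similar to the query, and rerank re-orders those candidates with a second, usually more accurate, scoring step. The vectors and rerank scores below are invented for illustration; in a real setup they would come from an embedding model and a rerank model.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int) -> list[int]:
    """Return the indices of the k chunks most similar to the query, best first."""
    order = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return order[:k]


def rerank(candidates: list[int], score_fn) -> list[int]:
    """Re-order the candidate indices by a second score, highest first."""
    return sorted(candidates, key=score_fn, reverse=True)


# Toy 2-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0]
chunks = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [1.0, 0.0]]

candidates = top_k(query, chunks, k=2)

# Hypothetical rerank scores; a rerank model would compute these from the text.
toy_scores = {0: 0.8, 3: 0.5}
reranked = rerank(candidates, lambda i: toy_scores[i])
```

A larger TopK gives the rerank step more candidates to choose from, at the cost of more text passed along to the chat model.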