Grok, an artificial intelligence (AI) product from xAI, the company founded by Elon Musk, is expected to gain the ability to process multimedia information soon. The information was revealed through developer documentation released by xAI.
In March 2024, Grok made significant progress with the release of Grok 1.5, which featured greatly improved reasoning capabilities. Previously, in a blog post last month, xAI hinted at Grok-1.5V offering "multimodal models in a number of domains." Recent updates to the developer documentation suggest that xAI is preparing to launch a new AI model. This implies that users may soon be able to upload photos to Grok and receive text-based answers. Specifically, the developer documentation demonstrates how developers can use the xAI software development kit (SDK) to generate a response based on both text and images. A sample Python script illustrates the process of reading an image file, setting up a text prompt, and using the xAI SDK to generate a response.
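The article does not reproduce xAI's sample script, so the following is only a minimal sketch of the first two steps it describes: reading an image file and pairing it with a text prompt. The function name `build_multimodal_request` and the payload field names are hypothetical and are not taken from the xAI SDK; the final step of sending the request is left out because the SDK's actual interface is not shown in the source.

```python
import base64
from pathlib import Path

# Map common image extensions to media types (illustrative subset only).
MEDIA_TYPES = {".png": "image/png", ".jpg": "image/jpeg", ".jpeg": "image/jpeg"}


def build_multimodal_request(image_path: str, prompt: str) -> dict:
    """Bundle a text prompt and an image into one JSON-serializable payload.

    Hypothetical helper: the field names below are assumptions made for
    illustration and do not reflect the real xAI SDK.
    """
    path = Path(image_path)
    image_bytes = path.read_bytes()  # step 1: read the image file
    media_type = MEDIA_TYPES.get(path.suffix.lower(), "application/octet-stream")
    return {
        "prompt": prompt,  # step 2: the text prompt
        "image": {
            "media_type": media_type,
            # Base64 keeps the binary image safe inside a text payload.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    }
```

A caller would pass a local file path and a question such as "What is in this image?", then hand the resulting payload to whatever client the real SDK provides.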
Launched in November 2023 and exclusively available to X Premium Plus subscribers, Grok is considered a "newcomer" in the AI field compared to heavyweight competitors like OpenAI's ChatGPT. Grok's standout feature is its real-time information access, including posts on the X platform. According to xAI, the Grok model is trained on "multiple public text data sources on the internet up to the third quarter of 2023 and data sets reviewed and selected by evaluators." A blog post by X also affirms that Grok-1 is not trained on X data (including public X posts). However, xAI acknowledges that large language models are often criticized because they can perform well on benchmarks if those benchmarks are included in their training data. This is akin to memorizing the answers to a test rather than truly understanding the content.
Nevertheless, according to a blog post by xAI, Grok 1.5 is gradually closing the gap with GPT-4 on various evaluation standards, from elementary school level to high school level competitions. Multimodal chatbots are seen as the next frontier in the AI race. Major industry players such as Google have announced new developments at the Google I/O event, while OpenAI has unveiled GPT-4o. Until now, the lack of multimodal capability has left Grok trailing behind. With ongoing upgrade efforts, could Grok spring a surprise in this challenging race?