More details about the mighty Phi-3 family of SLMs. This is a key development, as GenAI development becomes less about a single model and more about continuously optimizing and orchestrating, with LLMs routing requests to other LLMs and SLMs to balance complexity, latency, cost, and more! https://lnkd.in/gGQYqNKt
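A minimal sketch of what such routing can look like in practice; the complexity_score heuristic, threshold, and model names below are illustrative assumptions, not anything from the linked post:

```python
# Route cheap/simple queries to an SLM and hard ones to a larger LLM,
# trading capability against latency and cost.
def complexity_score(prompt: str) -> float:
    # Stand-in heuristic; in practice this could be a small classifier
    # or an LLM call that grades the request.
    return min(1.0, len(prompt.split()) / 200)

def route(prompt: str) -> str:
    # Simple threshold router: below it, the SLM is good enough.
    if complexity_score(prompt) < 0.5:
        return "phi-3-mini"   # fast, cheap SLM (hypothetical model id)
    return "gpt-4o"           # slower, costlier frontier LLM

print(route("Summarize this paragraph."))  # -> phi-3-mini
```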
-
Last week, with the announcements of GPT-4o and Google I/O, huge bets are on multi-modality agents. Today, we are excited to introduce Multi Agent Flow, powered by LangChain's LangGraph 🕸
A multi-agent flow consists of a team of agents that collaborate to complete a task delegated by a supervisor. Results are significantly better for long-running tasks. Here's why:
⚒ Dedicated prompt and tools for each agent
🔄 Reflective loop for auto-correction
🌐 Separate LLMs for different agents
Multi Agent Flow supports:
- Function Calling LLMs (Claude, Mistral, Gemini, OpenAI)
- Multi Modality (image, speech & files coming soon)
- API
- Prompt input variables
Available now in v1.8.0
Repo: https://lnkd.in/dsph3WMU
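For readers curious what the supervisor pattern looks like in code, here is a minimal sketch using LangGraph's core graph API; the state schema and stub agent functions are hypothetical stand-ins, not Multi Agent Flow's actual implementation:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class FlowState(TypedDict):
    task: str
    result: str
    next_agent: str

def supervisor(state: FlowState) -> FlowState:
    # Decide which worker acts next; in practice this is an LLM call.
    done = bool(state["result"])
    return {**state, "next_agent": "end" if done else "researcher"}

def researcher(state: FlowState) -> FlowState:
    # Placeholder worker: a real agent would use its own prompt,
    # tools, and (possibly separate) LLM.
    return {**state, "result": f"findings for: {state['task']}"}

graph = StateGraph(FlowState)
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher)
graph.set_entry_point("supervisor")
# Route on the supervisor's decision; workers report back to it.
graph.add_conditional_edges(
    "supervisor",
    lambda s: s["next_agent"],
    {"researcher": "researcher", "end": END},
)
graph.add_edge("researcher", "supervisor")
app = graph.compile()

print(app.invoke({"task": "summarize GPT-4o news", "result": "", "next_agent": ""}))
```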
-
🚀 See how Handshake cut their LLM GPU costs by 50% with Anyscale. Discover how they:
💰 Reduced LLM GPU costs by 50% or more
📈 Seamlessly scaled large language models (LLMs) without compromising performance
⏱ Enhanced operational efficiency, enabling faster development cycles
Check out the full story here: https://lnkd.in/gSiKJDaY
How Handshake Saves 50% on LLM GPU Costs with Anyscale
anyscale.com
-
Chains vs. group-chat has been a big differentiator between Flowise, AutoGen, and other frameworks. Building logic to progress through a chain and repeat steps if needed adds complexity that is easily overcome by agents collaborating in a group-chat. The downside is that you shift control from traditional programmatic steps to prompt engineering for making group decisions. For some workflows this is fine, but others require a more rigid progression of steps (e.g. CI/CD pipelines). A hybrid approach is the best of both worlds.
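A minimal sketch of that hybrid idea: rigid programmatic stages, with one stage handed off to a free-form group-chat. The agent lambdas below are hypothetical stand-ins for LLM-backed agents:

```python
from typing import Callable

def group_chat(agents: list[Callable[[str], str]], task: str, rounds: int = 3) -> str:
    # Agents take turns refining a shared answer (prompt-driven control).
    answer = task
    for _ in range(rounds):
        for agent in agents:
            answer = agent(answer)
    return answer

def pipeline(task: str) -> str:
    # Rigid, CI/CD-style progression (programmatic control)...
    spec = f"spec({task})"
    # ...except the design stage, which is resolved collaboratively.
    design = group_chat([lambda t: f"architect({t})",
                         lambda t: f"critic({t})"], spec, rounds=1)
    return f"deploy({design})"

print(pipeline("build feature X"))
```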
-
Efficiency of LLM infrastructure is one of the most important topics for wide-scale LLM adoption, and it spans many subtopics: inference optimization, GPU allocation, scaling up architectures, and more. Here is a very interesting read from Character.AI on how they optimize their inference infrastructure for production loads of 20,000 QPS: https://lnkd.in/gzPPzUYZ
Optimizing AI Inference at Character.AI
research.character.ai
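One concrete example of why this matters at that scale is KV-cache memory, a central lever in LLM serving. A back-of-the-envelope sketch; the model dimensions below are illustrative assumptions, not Character.AI's actual configuration:

```python
# KV-cache size: 2 (keys and values) x layers x kv_heads x head_dim
# x sequence length x batch, times bytes per element (fp16 = 2).
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Hypothetical 32-layer model, 8K context, batch of 16.
full = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=8192, batch=16)
mqa = kv_cache_bytes(layers=32, kv_heads=1, head_dim=128, seq_len=8192, batch=16)
print(f"full MHA cache: {full / 2**30:.1f} GiB, MQA cache: {mqa / 2**30:.2f} GiB")
# -> full MHA cache: 64.0 GiB, MQA cache: 2.00 GiB
```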
-
Watch GPT and Google Gemini go head-to-head in a game of trivia. GPT-5 is reportedly coming this summer, and just last week, Google began making Gemini 1.5 available to all developers. Between those two—plus Claude, Mistral, Llama, Perplexity, and more—it's hard to know which model to use. Sure, you could test them, but human evaluations are expensive. That's why I'm interested in how LLMs themselves can be used for automated evals. In this demo, the questions and assessments are all AI-generated. It's LLMs grading answers given by LLMs to questions written by LLMs. Lots of caveats apply (read the FAQ!), but I had fun building it.
GPT vs. Gemini | Two LLMs, one winner
gptversusgemini.com
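A minimal sketch of the LLMs-grading-LLMs loop described above, assuming a hypothetical llm(model, prompt) completion helper (stubbed here with canned responses so the sketch runs); the prompts and model names are illustrative, not the demo's actual setup:

```python
# Stand-in for a real completion call (e.g., an OpenAI or Anthropic
# client); returns canned text so the sketch runs end to end.
def llm(model: str, prompt: str) -> str:
    return "CORRECT" if "Reply CORRECT" in prompt else f"[{model} output]"

# LLMs write the question, answer it, and grade the answers.
def run_eval_round(contestants: list[str], judge: str = "judge-model") -> dict:
    question = llm(judge, "Write one hard trivia question with a short answer.")
    scores = {}
    for model in contestants:
        answer = llm(model, f"Answer concisely: {question}")
        verdict = llm(judge, f"Question: {question}\nAnswer: {answer}\n"
                             "Reply CORRECT or INCORRECT only.")
        scores[model] = verdict.strip().upper() == "CORRECT"
    return scores

print(run_eval_round(["gpt-4o", "gemini-1.5-pro"]))
```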
-
#Day22 of #100DaysOfGenAI Today, I delved into the concept of LLMOps, which focuses on the operational management of large language models, and explored Vext, a platform designed to simplify the LLM pipeline. Vext provides a suite of tools that make deploying, maintaining, and scaling LLMs more efficient. By streamlining the complexities of LLM workflows, Vext ensures smoother integration of these models into real-world applications. Understanding platforms like Vext is becoming increasingly important as the role of LLMs expands across various industries. This knowledge enhances my ability to work effectively with advanced AI systems, ensuring they operate at peak efficiency in production environments. Vext: https://vextapp.com #LLMOps #MachineLearning #AI #DeepLearning #LLM #Vext #100DaysOfCode
Vext - The LLMOps OS: LLM Pipeline Simplified
vextapp.com
-
"OneGen, a novel solution that unifies the retrieval and generation processes into a single forward pass within an LLM. By integrating autoregressive retrieval tokens into the model, OneGen enables the system to handle both tasks simultaneously without the need for multiple forward passes or separate retrieval and generation models. This innovative approach significantly reduces computational overhead and inference time, enhancing the efficiency of LLMs."
OneGen: An AI Framework that Enables a Single LLM to Handle both Retrieval and Generation Simultaneously
marktechpost.com
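A toy sketch of the single-pass idea: when decoding emits a special retrieval token, the hidden state at that position is reused directly as the dense query, with no second forward pass. Everything below (dimensions, random tensors, token names) is an illustrative stand-in, not OneGen's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
doc_index = rng.normal(size=(100, d))  # precomputed document embeddings
RETRIEVAL_TOKEN = "[RQ]"

def decode_step(token: str) -> np.ndarray:
    # Stand-in for one autoregressive step returning the hidden state.
    return rng.normal(size=d)

generated, retrieved = [], []
for token in ["The", "answer", RETRIEVAL_TOKEN, "is"]:
    hidden = decode_step(token)        # one forward pass per step
    generated.append(token)
    if token == RETRIEVAL_TOKEN:
        # The same hidden state doubles as the retrieval query.
        scores = doc_index @ hidden
        retrieved.append(int(scores.argmax()))
print(generated, retrieved)
```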
-
#Google #DeepMind presents a new hybrid architecture that enables tokens in the LLM to cross-attend to node embeddings from a GNN-based neural algorithmic reasoner (NAR). The resulting model, called TransNAR, demonstrates improvements in OOD reasoning across algorithmic tasks. A quote from the paper on why NARs could be useful: "NARs are capable of holding perfect generalization even on 6× larger inputs than ones seen in the training set, for highly complex algorithmic tasks with long rollouts". The key here is the generalization you get from NARs when they are combined with Transformers. https://lnkd.in/gUvpSWTt Google
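A toy sketch of the cross-attention hookup described above, with token states attending to NAR node embeddings; shapes and random tensors are illustrative stand-ins, not TransNAR's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32
tokens = rng.normal(size=(10, d))  # LLM token hidden states
nodes = rng.normal(size=(6, d))    # GNN/NAR node embeddings

# Queries come from tokens; keys and values from NAR nodes.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
q, k, v = tokens @ Wq, nodes @ Wk, nodes @ Wv
att = np.exp(q @ k.T / np.sqrt(d))
att /= att.sum(axis=-1, keepdims=True)
tokens = tokens + att @ v          # residual cross-attention update
print(tokens.shape)                # (10, 32): token states enriched by the NAR
```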
-
Best Resources to Learn & Understand Evaluating LLMs via #TowardsAI → https://bit.ly/3WAPbs2
Best Resources to Learn & Understand Evaluating LLMs
towardsai.net
-
"Through a series of machine learning innovations, we’ve increased 1.5 Pro’s context window capacity far beyond the original 32,000 tokens for Gemini 1.0. We can now run up to 1 million tokens in production. This means 1.5 Pro can process vast amounts of information in one go — including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code or over 700,000 words. In our research, we’ve also successfully tested up to 10 million tokens.' https://lnkd.in/ddrFBMnj
Our next-generation model: Gemini 1.5
blog.google
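A quick sanity check on those figures, using common rule-of-thumb token ratios (my assumptions, not Google's published tokenizer stats):

```python
# How far a 1M-token context stretches, under rough per-unit ratios.
budget = 1_000_000            # Gemini 1.5 Pro production context window
words = budget / 1.4          # ~1.4 tokens per English word (rule of thumb)
code_lines = budget / 33      # ~33 tokens per line of code (rough assumption)
print(f"~{words:,.0f} words, ~{code_lines:,.0f} lines of code")
# -> ~714,286 words, ~30,303 lines: consistent with the quoted
#    "over 700,000 words" and "over 30,000 lines of code".
```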