LLM Inference Speed (Tech Deep Dive)
In this tech talk, we dive deep into the technical specifics around LLM inference.
The big questions are: Why are LLMs slow? How can they be made faster? And how might slow inference shape UX in the next generation of AI-powered software?
We jump into:
- Is fast model inference the real moat for LLM companies?
- What are the implications of slow model inference on the future of decentralized and edge model inference?
- As demand rises, what will the latency/throughput tradeoff look like? (A rough sketch of this tradeoff follows the list.)
- What innovations on the horizon might massively speed up model inference?
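One way to make the latency/throughput tradeoff concrete is request batching during autoregressive decoding. The sketch below is illustrative only: the timing constants (`base_ms`, `per_seq_ms`) and the request length are assumptions, not figures from the episode. It simply shows how packing more requests into each decode step raises aggregate token throughput while stretching each individual request's completion time.

```python
# Toy model of the batching tradeoff in autoregressive decoding.
# Assumption: decode-step time grows only slowly with batch size,
# because decoding is largely memory-bandwidth bound.

def decode_step_ms(batch_size: int, base_ms: float = 20.0, per_seq_ms: float = 0.5) -> float:
    """Hypothetical wall-clock time for one decode step at a given batch size."""
    return base_ms + per_seq_ms * batch_size

def report(batch_size: int, tokens_per_request: int = 200) -> None:
    step_ms = decode_step_ms(batch_size)
    # Throughput: tokens generated per second across the whole batch.
    throughput = batch_size * 1000.0 / step_ms
    # Latency: time for one request to finish all of its tokens.
    latency_s = tokens_per_request * step_ms / 1000.0
    print(f"batch={batch_size:3d}  throughput={throughput:7.1f} tok/s  latency={latency_s:5.1f} s")

if __name__ == "__main__":
    for b in (1, 8, 32, 128):
        report(b)
```

With these made-up numbers, batch size 1 gives roughly 49 tok/s at ~4 s per request, while batch size 128 gives roughly 1,500 tok/s but each request now takes ~17 s, which is the kind of tension the episode explores as demand grows.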