RL, but don't do anything I wouldn't do
The paper critiques KL regularization in reinforcement learning, showing it fails with Bayesian predictive models, and proposes a new principle to better control advanced RL agent behavior.
https://arxiv.org/abs/2410.06213
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
--- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support
1589 episodes