Examining the Reliability of Logical Reasoning in Large Language Models
A recent study highlights the inconsistencies in reasoning paths of large language models, raising concerns about their reliability in generating answers.
6 articles tagged with "LLM"
A recent study highlights the inconsistencies in reasoning paths of large language models, raising concerns about their reliability in generating answers.
A recent study on arXiv investigates how well LLM judges align with human judgment in text evaluation, a critical factor in their reliability.
This piece delves into the evaluation methods for LLM judges, focusing on their robustness and the effects of post-decision interactions within benchmarking frameworks.
A recent study delves into the complexities of large language model (LLM) agents, focusing on the distinction between harness updating and harness benefit in their task execution.
Tiny-vLLM is an open-source inference engine for large language models that uses C++ and CUDA for performance and efficiency.
A recent study uncovers serious vulnerabilities in multi-agent LLM systems, highlighting the threat posed by domain-camouflaged injection attacks that evade detection.