Securing Machine Learning Model Ecosystems: A Comprehensive Security Analysis
Current Research Project: This is an ongoing, comprehensive investigation into security vulnerabilities in machine learning model hubs.
Collaboration
Research Collaboration: Since January 2025, this project has been conducted in collaboration with
Mohammad Latif Siddique, Ph.D. candidate at University of Notre Dame, USA. Mohammad specializes in software engineering, software security, code generation, and applied machine learning, and is currently a Ph.D. intern at Meta (Summer 2025) working with WhatsApp Core Consumer Messaging Groups & Communities on LLM applications.
Abstract
Machine learning model hubs have become critical infrastructure for AI development, hosting millions of pre-trained models that power modern AI applications. However, recent research has revealed that these platforms face significant security challenges, particularly related to remote code execution (RCE) vulnerabilities. This ongoing research project conducts a comprehensive security analysis of 15 major ML platforms, building upon the foundational work of Zhao et al. (2024) and expanding the understanding of ML ecosystem security threats.
Research Overview
This investigation is a systematic examination of machine learning model supply chain security, treating ML models as executable code rather than mere data files: a paradigm shift that fundamentally changes how we approach ML security. The research encompasses industry leaders like Hugging Face Hub (752,000+ models), Kaggle (355,000+ datasets), TensorFlow Hub, and PyTorch Hub, as well as emerging platforms across different geographical regions.
Key Research Questions
- Security Maturity Assessment: How do different ML platforms compare in their security implementations?
- Vulnerability Classification: What are the primary attack vectors in modern ML model ecosystems?
- Defensive Mechanisms: Which security technologies effectively mitigate RCE risks?
- Supply Chain Security: How can we establish secure practices for ML model distribution?
Methodology & Scope
The research methodology employs a multi-phase approach:
Phase 1: Platform Security Assessment
- Comprehensive Coverage: 15 major ML platforms across different ecosystems
- Security Framework Assessment: Evaluation using established security analysis frameworks
- Vulnerability Discovery: Systematic identification of potential attack vectors
Phase 2: Threat Modeling
- Attack Vector Analysis: Classification of RCE vulnerabilities in ML contexts
- Real-world Case Studies: Analysis of actual security incidents (JFrog’s discovery of 100+ malicious models, ReversingLabs’ “NullifAI” attacks)
- Platform Comparison: Detailed security maturity assessment
Phase 3: Defensive Technology Evaluation
- SafeTensors Analysis: Evaluation of secure serialization formats
- Scanning Pipeline Assessment: Analysis of automated detection systems such as the MalHug framework
- Runtime Protection: eBPF monitoring, container sandboxing, and cryptographic signing
Key Findings
Security Landscape Overview
The research reveals significant security maturity variations across platforms:
Advanced Security Platforms (e.g., Hugging Face):
- Multi-layered defense systems
- ClamAV antivirus scanning
- PickleScan for malicious pickle detection
- TruffleHog for secret detection
- Partnership with Protect AI’s Guardian technology
- “Zero trust” approach implementation
Basic Security Platforms (e.g., PyTorch Hub):
- Minimal protective measures
- User-responsibility security model
- Limited automated scanning
- Basic policy enforcement
Critical Vulnerability Patterns
Serialization Vulnerabilities:
- Over 55% of models use potentially vulnerable formats
- Python pickle format enables arbitrary code execution (illustrated in the sketch below)
- Legacy serialization methods lack security controls
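To make the pickle risk concrete, the following minimal sketch (standard library only, with a deliberately harmless payload) shows how an object's `__reduce__` hook causes code to run the moment a pickle-based model file is deserialized; real attacks embed equivalent payloads inside otherwise functional checkpoints.

```python
import pickle


class MaliciousPayload:
    def __reduce__(self):
        # Unpickling calls the returned callable with these arguments.
        # A real attack would return os.system, subprocess.run, or similar instead.
        return (print, ("arbitrary code executed while 'loading the model'",))


blob = pickle.dumps(MaliciousPayload())   # what an attacker uploads
pickle.loads(blob)                        # what a victim's loader does
```

Because the payload fires inside `pickle.loads`, any loader built on pickle (for example, `torch.load` on legacy checkpoints without `weights_only=True`) inherits the same exposure.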
Recent CVE Discoveries:
- CVE-2025-1550: Keras Lambda layer code execution despite “safe mode”
- CVE-2024-27132: MLflow YAML recipe injection leading to RCE
- Framework-level vulnerabilities that extend beyond the model files themselves (see the Keras sketch below)
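The Keras CVE above concerns a bypass of safe mode; the hedged sketch below (assuming Keras 3, and not a reproduction of CVE-2025-1550 itself) illustrates the underlying issue that makes such bugs high-impact: a Lambda layer captures arbitrary Python, so a model artifact is only as safe as the loader's deserialization policy.

```python
import keras
from keras import layers

model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Lambda(lambda x: x * 2),  # arbitrary Python captured in the saved artifact
    layers.Dense(1),
])
model.save("lambda_model.keras")

# The default safe_mode=True refuses to deserialize the embedded lambda;
# opting out loads it, and the captured code runs whenever the model is called.
restored = keras.saving.load_model("lambda_model.keras", safe_mode=False)
```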
Innovative Security Solutions
SafeTensors Technology:
- Pure tensor data storage without code execution capability (a usage sketch follows this list)
- Elimination of deserialization attacks by design
- Backward compatibility with existing ML workflows
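As a brief illustration of the workflow, the sketch below (assuming the `safetensors` and `torch` packages are installed) saves and reloads a state dict; only tensor bytes and a JSON header cross the serialization boundary, so loading cannot trigger code execution.

```python
import torch
from safetensors.torch import load_file, save_file

# A state dict is just named tensors; no classes, functions, or bytecode.
tensors = {
    "embedding.weight": torch.zeros((1024, 768)),
    "classifier.bias": torch.zeros(10),
}

save_file(tensors, "model.safetensors")

# Loading parses the JSON header and maps raw tensor bytes; nothing is executed.
restored = load_file("model.safetensors")
print(restored["classifier.bias"].shape)  # torch.Size([10])
```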
Advanced Detection Systems:
- MalHug Framework: 91 malicious models detected across 705,000+ models
- Static analysis combined with taint tracking (a simplified scanning sketch follows this list)
- Automated threat identification and classification
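MalHug's full pipeline is not reproduced here; the sketch below is a simplified, heuristic illustration of the static-analysis stage such scanners share, using Python's standard `pickletools` to walk a pickle's opcode stream and flag imports of commonly abused modules without ever executing the file.

```python
import io
import pickletools

SUSPICIOUS = {"os", "posix", "subprocess", "builtins", "socket", "runpy"}


def scan_pickle(data: bytes) -> list[str]:
    """Heuristically flag pickles that import or call risky objects."""
    findings = []
    recent_strings = []  # STACK_GLOBAL reads the module/name pushed just before it
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name in {"SHORT_BINUNICODE", "BINUNICODE", "UNICODE"}:
            recent_strings.append(str(arg))
        elif opcode.name in {"GLOBAL", "INST"}:
            ref = str(arg).replace(" ", ".")
            if ref.split(".")[0] in SUSPICIOUS:
                findings.append(f"imports {ref}")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            module, name = recent_strings[-2], recent_strings[-1]
            if module in SUSPICIOUS:
                findings.append(f"imports {module}.{name}")
        elif opcode.name == "REDUCE":
            findings.append("calls an imported object (REDUCE)")
    return findings


# Example: scan a downloaded checkpoint file without ever loading it.
# findings = scan_pickle(open("suspect_model.pkl", "rb").read())
```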
Technical Innovation
Security Maturity Framework
The research develops a comprehensive assessment matrix categorizing platforms from “Basic” to “Advanced” based on the following criteria (a hypothetical scoring sketch follows the list):
- Automated Scanning: Virus detection, pickle analysis, secret scanning
- Access Controls: Authentication, authorization, rate limiting
- Content Validation: Model verification, signature checking
- Incident Response: Threat detection, automated remediation
- Community Safety: Reporting mechanisms, moderation systems
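One possible, purely hypothetical encoding of this matrix is sketched below: each criterion is scored 0 to 2 and the total maps to a maturity tier. The criterion names mirror the list above, while the scoring scale and thresholds are illustrative assumptions rather than the project's actual rubric.

```python
from dataclasses import dataclass


@dataclass
class PlatformAssessment:
    """Hypothetical 0-2 score per criterion from the maturity matrix above."""
    automated_scanning: int
    access_controls: int
    content_validation: int
    incident_response: int
    community_safety: int

    def maturity(self) -> str:
        total = (self.automated_scanning + self.access_controls
                 + self.content_validation + self.incident_response
                 + self.community_safety)
        # Illustrative thresholds, not the project's actual rubric.
        if total >= 8:
            return "Advanced"
        if total >= 4:
            return "Intermediate"
        return "Basic"


# Example: strong scanning and moderation, weak validation and incident response.
print(PlatformAssessment(2, 1, 1, 0, 2).maturity())  # -> Intermediate
```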
Runtime Security Architecture
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Model Upload │───▶│ Security Scanner │───▶│ Safe Storage │
│ & Validation │ │ Pipeline │ │ & Delivery │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ eBPF Monitoring │ │ Container │ │ Cryptographic │
│ & Threat │ │ Sandboxing │ │ Verification │
│ Detection │ │ & Isolation │ │ & Signing │
└─────────────────┘ └──────────────────┘ └─────────────────┘
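Focusing on the “Cryptographic Verification & Signing” stage of the architecture above, the sketch below (file names, key distribution, and the Ed25519 choice are assumptions) checks a downloaded artifact against a published SHA-256 digest and a publisher signature using the `cryptography` package before the model is ever deserialized.

```python
import hashlib
from pathlib import Path

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey


def verify_artifact(model_path: str, expected_sha256: str,
                    publisher_key: bytes, signature: bytes) -> bool:
    """Return True only if the artifact is both intact and authentically signed."""
    data = Path(model_path).read_bytes()

    # Integrity: the bytes on disk match the digest the publisher advertised.
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        return False

    # Authenticity: the signature over those bytes verifies under the
    # publisher's Ed25519 public key (verify() raises InvalidSignature on mismatch).
    try:
        Ed25519PublicKey.from_public_bytes(publisher_key).verify(signature, data)
    except InvalidSignature:
        return False
    return True
```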
Industry Impact & Real-World Applications
Case Studies Analysis
Hugging Face Evolution (2021-2025):
- Progression from minimal security to comprehensive “zero trust” approach
- 4.47 million model versions scanned by April 2025
- Tens of thousands of security issues flagged and resolved
Attack Sophistication Trends:
- Evolution from simple malicious uploads to sophisticated bypass techniques
- “NullifAI” attacks successfully evading security scanners
- Supply chain compromise through malicious or hijacked dependencies
Security Recommendations
- Mandatory Security Scanning: Implement multi-layer automated analysis
- Secure Serialization: Transition to SafeTensors or equivalent secure formats
- Runtime Isolation: Deploy container-based sandboxing for model execution (see the sketch after this list)
- Cryptographic Verification: Establish model signing and verification workflows
- Community Governance: Implement robust reporting and moderation systems
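As an illustration of the runtime-isolation recommendation, the sketch below launches inference for an untrusted model in a locked-down Docker container from Python. The image name, mount paths, and entry script are hypothetical; the flags (no network, read-only filesystem, dropped capabilities, resource caps) are standard Docker options.

```python
import subprocess


def run_model_sandboxed(model_dir: str, image: str = "ml-sandbox:latest"):
    """Run inference for an untrusted model inside a throwaway, locked-down container."""
    cmd = [
        "docker", "run", "--rm",
        "--network", "none",              # no network: no exfiltration or callbacks
        "--read-only",                    # immutable root filesystem
        "--cap-drop", "ALL",              # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--memory", "4g", "--cpus", "2",  # cap resource consumption
        "-v", f"{model_dir}:/model:ro",   # mount the model read-only
        image,
        "python", "/app/run_inference.py", "--model", "/model",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```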
Future Research Directions
Emerging Threats
- AI-Generated Malware: Models trained to generate malicious code
- Model Poisoning: Subtle backdoors in seemingly legitimate models
- Supply Chain Attacks: Compromise through dependencies and infrastructure
Defensive Innovation
- Zero-Trust Model Execution: Treating every model as potentially malicious by default
- Behavioral Analysis: Runtime monitoring of model behavior patterns
- Federated Security: Collaborative threat intelligence across platforms
Standardization Efforts
- Industry Security Standards: Establishment of ML security best practices
- Compliance Frameworks: Integration with existing cybersecurity regulations
- Certification Programs: Security validation for ML model publishers
Research Impact & Contributions
This research provides:
- Comprehensive Security Assessment: First systematic analysis of ML platform security across 15 major hubs
- Practical Security Framework: Actionable recommendations for platform operators
- Threat Intelligence: Detailed analysis of real-world attack patterns
- Defensive Technology Evaluation: Assessment of current and emerging security solutions
The findings directly inform:
- Platform Security Policies: Evidence-based security implementations
- Developer Best Practices: Secure model development and distribution guidelines
- Risk Assessment Frameworks: Quantitative security evaluation methodologies
- Industry Standards: Contribution to emerging ML security standards
Conclusion
This ongoing research represents a crucial step toward securing the foundation of modern AI development. As machine learning models become increasingly integrated into critical infrastructure and decision-making processes, understanding and mitigating security risks in the ML supply chain becomes paramount.
The comprehensive security maturity matrix and defensive recommendations developed through this research serve as both an assessment tool for current platforms and a roadmap for emerging hubs seeking to implement robust security measures from inception.
By treating machine learning models as part of the software supply chain—with all associated security considerations—this work provides essential groundwork for establishing industry standards and best practices in an era where AI systems are becoming ubiquitous in society.
References
- Zhao, H., Chen, H., Yang, F., et al. (2024). "Models Are Codes: Towards Measuring Malicious Code Poisoning Attacks on Pre-trained Model Hubs." arXiv:2409.09368.
- JFrog Security Research Team. (2023). "Machine Learning Security: Malicious Models on Hugging Face."
- ReversingLabs. (2024). "NullifAI: Novel Attack Techniques Bypassing ML Security Scanners." Referenced in Dark Reading.
- Hugging Face. (2024). "2024 Security Feature Highlights." Hugging Face Blog.
- Hugging Face & Protect AI. (2025). "4M Models Scanned: Protect AI + Hugging Face 6 Months In." Hugging Face Blog.
- MIT Cybersecurity Research. (2024). "Hugging Face AI Platform Riddled With 100 Malicious Code Execution Models."
- PyTorch Team. (2024). "Security Policy." PyTorch GitHub Repository.
- MITRE Corporation. (2025). "CVE-2025-1550: Keras Lambda Layer Code Execution Vulnerability."
- MITRE Corporation. (2024). "CVE-2024-27132: MLflow YAML Recipe Injection Leading to RCE."
- Hugging Face. (2022). "SafeTensors: A New Simple Format for Storing Tensors Safely." Hugging Face Documentation.
- Davis, J. (2024). "An Empirical Study of Artifacts and Security Risks in the Pre-trained Model Supply Chain." Medium.
- Forbes Security Coverage. (2024). "Hackers Have Uploaded Thousands Of Malicious Files To AI's Biggest Online Repository."