
Di Zhang

di-zhang-fdu

AI & ML interests

AI4Chem, LLM, Green LLM

Organizations

AI4Chem, NVIDIA-Eagle

di-zhang-fdu's activity

posted an update 12 days ago
Our new paper MOOSE-Chem3 introduces experiment-guided hypothesis ranking, a novel setting in which candidate hypotheses are prioritized based on experimental feedback from previously tested hypotheses.

To support research in this area, the work proposes a simulator grounded in three domain-informed assumptions that can generate simulated experimental feedback without requiring costly real-world trials.

The simulator is validated on a curated dataset of 124 chemistry hypotheses, and the resulting method outperforms strong pre-experiment baselines. This enables scalable research on feedback-driven hypothesis discovery strategies in scientific domains where empirical validation is expensive or slow.
MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback (2505.17873)
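
For intuition, here is a minimal sketch of the experiment-guided ranking loop under toy assumptions: `simulate_feedback` and `similarity` below are hypothetical stand-ins for the paper's domain-grounded simulator and a real hypothesis-similarity measure.

```python
# Toy sketch of experiment-guided hypothesis ranking (not the paper's code).

def simulate_feedback(hypothesis: str) -> float:
    """Hypothetical stand-in for the domain-grounded simulator:
    returns a deterministic pseudo-reward in [0, 1)."""
    return (sum(map(ord, hypothesis)) % 100) / 100.0

def similarity(a: str, b: str) -> float:
    """Toy Jaccard word overlap; a real system would use embeddings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def rank_hypotheses(candidates: list[str], budget: int = 3):
    """Greedily test the top-ranked hypothesis, record its simulated
    feedback, and re-score untested candidates by similarity-weighted
    feedback from everything tested so far."""
    tested: dict[str, float] = {}
    order = []
    remaining = list(candidates)
    for _ in range(min(budget, len(remaining))):
        def score(h: str) -> float:
            return sum(similarity(h, t) * f for t, f in tested.items())
        remaining.sort(key=score, reverse=True)
        chosen = remaining.pop(0)
        tested[chosen] = simulate_feedback(chosen)
        order.append((chosen, tested[chosen]))
    return order

hypotheses = [
    "ligand X accelerates the coupling reaction",
    "solvent polarity controls the coupling reaction yield",
    "temperature above 80 C degrades the catalyst",
]
for h, fb in rank_hypotheses(hypotheses):
    print(f"{fb:.2f}  {h}")
```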
reacted to Kseniase's post with 🚀 12 days ago
12 Types of JEPA

JEPA, or Joint Embedding Predictive Architecture, is an approach to building AI models introduced by Yann LeCun. Unlike generative models that predict the next token or pixel, it predicts the representation of a missing or future part of the input in an abstract embedding space. This encourages conceptual understanding rather than low-level pattern matching, nudging models toward more abstract reasoning.
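
As a rough illustration of the shared recipe behind these variants (a sketch, not any specific paper's model), assume toy MLP encoders and random "patch" features:

```python
import torch
import torch.nn as nn

dim, n_patches = 64, 16

# Context encoder, target encoder, and predictor (toy MLPs here;
# real JEPA variants use ViTs or domain-specific encoders).
context_encoder = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
target_encoder = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
predictor = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

x = torch.randn(8, n_patches, dim)      # batch of "patch" features
mask = torch.rand(8, n_patches) < 0.5   # which patches are hidden

# Targets come from the full input; no gradients flow into the target
# encoder (in practice it is typically an EMA copy of the context encoder).
with torch.no_grad():
    targets = target_encoder(x)

# The context encoder only sees visible patches (masked ones zeroed here
# for simplicity); the predictor fills in the masked positions.
context = context_encoder(x * (~mask).unsqueeze(-1))
pred = predictor(context)

# The loss lives in embedding space -- no pixel or token reconstruction.
loss = ((pred - targets) ** 2)[mask].mean()
loss.backward()
print(loss.item())
```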

Here are 12 types of JEPA you should know about:

1. I-JEPA -> Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture (2301.08243)
A non-generative, self-supervised learning framework for images. It masks parts of an image and predicts the representations of the masked parts from the visible context

2. MC-JEPA -> MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features (2307.12698)
Simultaneously interprets video data - dynamic elements (motion) and static details (content) - using a shared encoder

3. V-JEPA -> Revisiting Feature Prediction for Learning Visual Representations from Video (2404.08471)
Presents vision models trained by predicting future video features, without pretrained image encoders, text, negative sampling, or reconstruction

4. UI-JEPA -> UI-JEPA: Towards Active Perception of User Intent through Onscreen User Activity (2409.04081)
Masks unlabeled UI sequences to learn abstract embeddings, then adds a fine-tuned LLM decoder for intent prediction.

5. Audio-based JEPA (A-JEPA) -> A-JEPA: Joint-Embedding Predictive Architecture Can Listen (2311.15830)
Masks spectrogram patches with a curriculum, encodes them, and predicts hidden representations.

6. S-JEPA -> S-JEPA: towards seamless cross-dataset transfer through dynamic spatial attention (2403.11772)
Signal-JEPA is used in EEG analysis. It adds a spatial block-masking scheme and three lightweight downstream classifiers

7. TI-JEPA -> TI-JEPA: An Innovative Energy-based Joint Embedding Strategy for Text-Image Multimodal Systems (2503.06380)
Text-Image JEPA uses self-supervised, energy-based pre-training to map text and images into a shared embedding space, improving cross-modal transfer to downstream tasks

Find more types below 👇

Also, explore the basics of JEPA in our article: https://www.turingpost.com/p/jepa

If you liked it, subscribe to the Turing Post: https://www.turingpost.com/subscribe
posted an update 3 months ago
posted an update 6 months ago
posted an update 6 months ago
replied to their post 6 months ago

We will write a short technical report on current progress.

reacted to their post with 🚀 6 months ago
replied to their post 6 months ago
posted an update 6 months ago
posted an update 6 months ago
LLaMA-O1 Base and SFT models will be uploaded to HF today.
The RLHF pipeline is already in place; we are still waiting on data sampling.
replied to jwu323's post 6 months ago
reacted to jwu323's post with 🚀 6 months ago
We are excited to announce a new internal project, Rome, focused on advancing LLM reasoning. The code and accompanying paper will be released soon. Stay tuned!
replied to their post 7 months ago
replied to their post 7 months ago

main.py is the entry point for fine-tuning, but the code needs further improvement; see 'Call for contributors'.

posted an update 7 months ago
Discovered an outrageous bug on the ChatGPT official website, especially for those using ad-blocking plugins. Please make sure to add browser-intake-datadoghq.com to your ad block whitelist. The ChatGPT webpage relies on this site for heartbeat detection, but since it belongs to an ad tracking network, it's included in major ad-blocking lists. (If you're using Clash, also remember to add it to the whitelist.) Failing to do so may cause the ChatGPT web interface to display a greyed-out send button after clicking, with no response.

For users with Chinese IP addresses, consider adding this domain to the rules of your U.S. node, since the response headers from this site report the user's physical location to GPT.
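
If you want to verify whether the domain is blocked outside the browser (an in-browser ad blocker won't show up here), a quick DNS-level check, assuming Python is available:

```python
# Checks whether browser-intake-datadoghq.com is blocked at the DNS or
# proxy level (e.g., by a Clash rule). This will NOT detect in-browser
# ad blockers, which filter requests inside the browser itself.
import socket

host = "browser-intake-datadoghq.com"
try:
    addrs = socket.getaddrinfo(host, 443)
    print(f"{host} resolves to {addrs[0][4][0]} - not blocked at DNS level")
except socket.gaierror as e:
    print(f"{host} failed to resolve ({e}) - likely blocked at DNS level")
```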
posted an update 7 months ago
LLaMA-O1: Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace
Large Reasoning Models powered by Monte Carlo Tree Search (MCTS), Self-Play Reinforcement Learning, PPO, AlphaGo Zero's dual-policy paradigm, and Large Language Models!
https://github.com/SimpleBerry/LLaMA-O1/

What will happen when you compound MCTS ❤ LLM ❤ Self-Play ❤ RLHF?
Just a little bite of strawberry! 🍓
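
For a feel of the core loop, here is a stripped-down sketch in the spirit of MCT Self-Refine, not the repo's actual implementation: `refine` and `reward` are hypothetical stand-ins for an LLM refinement call and a scalar evaluator.

```python
import math, random

class Node:
    def __init__(self, answer, parent=None):
        self.answer, self.parent = answer, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper confidence bound: balance exploitation and exploration.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def refine(answer):
    """Stand-in for an LLM call that critiques and rewrites an answer."""
    return answer + "'"

def reward(answer):
    """Stand-in for a scalar reward (e.g., pairwise preference or verifier)."""
    return random.random()

def mcts(root_answer, iters=50):
    root = Node(root_answer)
    for _ in range(iters):
        node = root
        # 1. Selection: walk down by UCB until reaching a leaf.
        while node.children:
            node = max(node.children, key=ucb)
        # 2. Expansion: ask the model for a refined answer.
        child = Node(refine(node.answer), parent=node)
        node.children.append(child)
        # 3. Evaluation and 4. backpropagation of the reward.
        r = reward(child.answer)
        while child:
            child.visits += 1
            child.value += r
            child = child.parent
    return max(root.children, key=lambda n: n.value / n.visits).answer

print(mcts("draft answer"))
```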

Past related works:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
posted an update 10 months ago
replied to their post 11 months ago

ๅŽ็ซฏๅผ‚ๅธธ๏ผŒๆŒ‚ๆމไบ†๏ผŒๅœจไฟฎๅค

posted an update 11 months ago
Preview:
We will open-source the 2.5B ChemVLM and the tool-enhanced ChemLLM-7B in the near future.
posted an update 12 months ago