Abdel-Dayane Marcos

admarcosai

AI & ML interests

Natural Language Processing, Graph Neural Networks, Reinforcement Learning

Organizations

None yet

admarcosai's activity

upvoted a collection 24 days ago

Granite Code Models

A series of code models trained by IBM, licensed under the Apache 2.0 license. We release both the base pretrained and instruct models. • 18 items • Updated 3 days ago • 137

upvoted 59 papers about 1 month ago

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 99

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Paper • 2404.01367 • Published Apr 1 • 19

Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2 • 41

Octopus v2: On-device language model for super agent

Paper • 2404.01744 • Published Apr 2 • 53

ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

Paper • 2404.02893 • Published Apr 3 • 19

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

Paper • 2404.02575 • Published Apr 3 • 46

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 102

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4 • 15

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4 • 74

Social Skill Training with Large Language Models

Paper • 2404.04204 • Published Apr 5 • 14

Stream of Search (SoS): Learning to Search in Language

Paper • 2404.03683 • Published Apr 1 • 21

CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

Paper • 2404.03820 • Published Apr 4 • 21

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

Paper • 2404.04125 • Published Apr 4 • 27

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4 • 58

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Paper • 2404.05726 • Published Apr 8 • 18

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Paper • 2404.05014 • Published Apr 7 • 23

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

Paper • 2404.05717 • Published Apr 8 • 23

ByteEdit: Boost, Comply and Accelerate Generative Image Editing

Paper • 2404.04860 • Published Apr 7 • 24

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8 • 57

CodecLM: Aligning Language Models with Tailored Synthetic Data

Paper • 2404.05875 • Published Apr 8 • 15

MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

Paper • 2404.06395 • Published Apr 9 • 18

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Paper • 2404.06512 • Published Apr 9 • 29

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Paper • 2404.05892 • Published Apr 8 • 28

OmniFusion Technical Report

Paper • 2404.06212 • Published Apr 9 • 73

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

Paper • 2404.06903 • Published Apr 10 • 14

RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion

Paper • 2404.07199 • Published Apr 10 • 22

RULER: What's the Real Context Size of Your Long-Context Language Models?

Paper • 2404.06654 • Published Apr 9 • 32

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Paper • 2404.07143 • Published Apr 10 • 93

Sparse Laneformer

Paper • 2404.07821 • Published Apr 11 • 9

LLoCO: Learning Long Contexts Offline

Paper • 2404.07979 • Published Apr 11 • 15

Audio Dialogues: Dialogues dataset for audio and music understanding

Paper • 2404.07616 • Published Apr 11 • 14

HGRN2: Gated Linear RNNs with State Expansion

Paper • 2404.07904 • Published Apr 11 • 16

WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents

Paper • 2404.05902 • Published Apr 8 • 20

Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models

Paper • 2404.07973 • Published Apr 11 • 28

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Paper • 2404.07413 • Published Apr 11 • 32

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11 • 41

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Paper • 2404.07839 • Published Apr 11 • 39

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

Paper • 2404.07987 • Published Apr 11 • 46

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 80

Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies

Paper • 2404.08197 • Published Apr 12 • 26

Pre-training Small Base LMs with Fewer Tokens

Paper • 2404.08634 • Published Apr 12 • 32

On Speculative Decoding for Multimodal Large Language Models

Paper • 2404.08856 • Published Apr 13 • 11

Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video

Paper • 2404.09833 • Published Apr 15 • 27

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Paper • 2404.09967 • Published Apr 15 • 20

Compression Represents Intelligence Linearly

Paper • 2404.09937 • Published Apr 15 • 27

TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14 • 42

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15 • 80

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Paper • 2404.08801 • Published Apr 12 • 62

Scaling Instructable Agents Across Many Simulated Worlds

Paper • 2404.10179 • Published Mar 13 • 23

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Paper • 2404.12241 • Published Apr 18 • 10

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Paper • 2404.11912 • Published Apr 18 • 16

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Paper • 2404.12387 • Published Apr 18 • 35

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17 • 40

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 51

PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation

Paper • 2404.13026 • Published Apr 19 • 21

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published Apr 19 • 38

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

Paper • 2404.14396 • Published Apr 22 • 17

FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17 • 32

A Multimodal Automated Interpretability Agent

Paper • 2404.14394 • Published Apr 22 • 19