7th Army Training Command
Video by Kenneth M McNulty, Kevin D Schmidt
Michael Robinson - Topological Features in Large Language Models (and beyond?)
Air Force Research Laboratory
Oct. 11, 2024 | 01:00:53
In this edition of QuEST, Michael Robinson discusses topological features in large language models.
Key Moments and Questions in the video include:
Acknowledgement of colleagues from DARPA and Galois
Manifolds in machine learning
LLM token space is higher dimensional
Manifold spaces tend to be negatively curved
LLMs turn text into vectors
Transformers turn vectors into new text
How do we turn the text into vectors?
We think of LLMs as being trained on all human language, but they have not been
GPT2, an open-source LLM, as the source model
GPT2 used as the example
Tokens have topology and geometry
Words are a categorical variable
Vectors are a numerical variable
Mixing data types can lead to some problems
Why care about the token space?
Not all tokens correspond to a valid vector
Estimating dimensions
Volume of a sphere
Log of Volume vs log of radius curves
Ricci scalar curvature
Stratifications are visible
GPT2 uses a state space that is not a manifold
Dollar sign appears differently in GPT2 because $ is used in code, whereas other currency symbols are not
GPT2’s 768 dimensions unwrapped using t-SNE
Tokens with leading spaces
Beginnings of words show up in a separate piece of lower dimension
Visual similarity to hyperbolic plane
Llemma-7B dimensions
Plotting dimension
Dark spaces are non-printing characters
Thinking about how neural activation patterns work
We have been thinking about manifold learning out of mathematical convenience
State spaces are not manifolds
Open presentation to conversation
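The key moments on estimating dimension (volume of a sphere, log of volume vs. log of radius) can be illustrated with a small sketch. The volume of a ball in a d-dimensional space grows like the radius raised to the power d, so counting how many points fall inside balls of increasing radius and fitting a line to log(count) against log(radius) gives a slope near d. The code below is a minimal illustration under stated assumptions, not the method used in the talk: the function name `estimate_local_dimension`, the synthetic 2-D sheet standing in for token vectors, and the chosen radii are all hypothetical.

```python
import numpy as np

def estimate_local_dimension(points, center, radii):
    """Estimate dimension near `center` from the slope of log(count) vs log(radius)."""
    # Distance from the chosen center to every point in the cloud.
    dists = np.linalg.norm(points - center, axis=1)
    # Number of points inside each ball: a proxy for the ball's volume.
    counts = np.array([(dists <= r).sum() for r in radii])
    # Keep only radii whose ball holds more than one point, so the log is defined.
    mask = counts > 1
    # Least-squares line through (log r, log count); its slope estimates dimension.
    slope, _intercept = np.polyfit(np.log(radii[mask]), np.log(counts[mask]), 1)
    return slope

# Hypothetical stand-in for token vectors: a flat 2-D sheet placed in 10-D space.
rng = np.random.default_rng(0)
sheet = np.zeros((20000, 10))
sheet[:, :2] = rng.uniform(-1.0, 1.0, size=(20000, 2))

center = np.zeros(10)
radii = np.linspace(0.1, 0.5, 9)
dim = estimate_local_dimension(sheet, center, radii)
print(f"estimated dimension: {dim:.2f}")
```

On the synthetic sheet the slope should come out close to 2 even though the ambient space has 10 coordinates. On a curved space the log-log curve bends away from a straight line as the radius grows, and on a space that is not a manifold the slope can differ from one center point to another, which is one way stratifications could become visible.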
Tags
quest
AFRL
ACT3
large language models
manifolds in machine learning
Artificial Horizon Indicator (AHI)