Co-citation based data augmentation for contrastive learning of scientific domains
Data compilation
We used co-citations as a similarity heuristic to generate sufficiently large training datasets for contrastive learning over scientific domains. This strategy produces large training datasets from small amounts of data because of the nonlinear scaling of citation graphs: a single paper citing N other papers yields N(N−1)/2 ("N choose 2") co-citation pairs. For context, a dataset of 10,000 individual papers can produce well over 125,000 co-citation pairs. While this measure of similarity is not perfect, co-citations have generally been shown to imply a high degree of similarity between papers21. For our modeling purposes, we assume that two co-cited papers are more similar to each other than two randomly chosen papers are, even when the random pair comes from the same field.
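As an illustration of this scaling, the following minimal Python sketch enumerates the co-citation pairs induced by each citing paper's reference list. The `references` mapping is hypothetical toy data, not the paper's dataset:

```python
from itertools import combinations
from collections import Counter

# Hypothetical citation data: citing paper -> list of cited paper IDs.
references = {
    "citing_1": ["A", "B", "C", "D"],   # 4 references -> 6 co-citation pairs
    "citing_2": ["B", "C", "E"],        # 3 references -> 3 co-citation pairs
}

# Count how many times each unordered pair of papers is co-cited.
co_citations = Counter()
for cited in references.values():
    # N references produce N*(N-1)/2 unordered pairs.
    for a, b in combinations(sorted(set(cited)), 2):
        co_citations[(a, b)] += 1

print(co_citations)
# e.g. Counter({('B', 'C'): 2, ('A', 'B'): 1, ...})
```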
To build our dataset, we randomly chose five biomedical subfields with little overlap: cardiovascular disease (CVD), chronic obstructive pulmonary disease (COPD), parasitic diseases, autoimmune diseases, and skin cancers. PubMed Central was queried with Medical Subject Heading (MeSH) terms for each domain, keeping papers published between 2010 and 2022 that had an abstract and at least one citation. Within this time window, we retained those of the possible N(N−1)/2 co-citation pairs per citing paper whose members were returned under the same MeSH terms. When constructing the final dataset, we sampled preferentially from pairs that were co-cited more often, as sketched below.
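One simple way to realize this preferential sampling is to weight each pair by its co-citation count. This is only a sketch of that idea, reusing the hypothetical `co_citations` counter from the previous snippet rather than the paper's actual sampling scheme:

```python
import random

# Sample training pairs with probability proportional to their co-citation count,
# so frequently co-cited pairs are preferred (illustrative; samples with replacement).
pairs = list(co_citations.keys())
weights = [co_citations[p] for p in pairs]

training_pairs = random.choices(pairs, weights=weights, k=10_000)
```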
For evaluation, we constructed "negative" examples from pairs of abstracts that were not co-cited. The training dataset was split randomly in a 99:1 ratio, followed by deduplication. Negative pairs were built by pairing abstracts that had never been co-cited and had each been cited at least 15 times. This criterion allowed us to construct a representative evaluation set for binary classification with balanced classes, labeled 1 for co-cited pairs and 0 otherwise. The exact dataset counts are outlined in Table 1.
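A minimal sketch of how such a balanced evaluation set could be assembled, assuming hypothetical `co_citations` and `citation_counts` mappings rather than the paper's actual pipeline:

```python
import random

def build_eval_set(co_citations, citation_counts, n_pairs, min_citations=15, seed=0):
    """Return a balanced list of (paper_a, paper_b, label), label 1 for co-cited pairs."""
    rng = random.Random(seed)

    # Positive examples: co-cited pairs, labeled 1.
    positives = [(a, b, 1) for (a, b) in co_citations]
    positives = rng.sample(positives, n_pairs)

    # Negative examples: well-cited papers that were never co-cited, labeled 0.
    eligible = [p for p, c in citation_counts.items() if c >= min_citations]
    negatives = []
    while len(negatives) < n_pairs:
        a, b = rng.sample(eligible, 2)
        if (a, b) not in co_citations and (b, a) not in co_citations:
            negatives.append((a, b, 0))

    return positives + negatives
```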
Transformer neural networks
The transformer architecture is adept at sequential processing and is state-of-the-art for various natural language processing (NLP) and vision tasks24,25,26,27,28,29,30. A transformer block comprises a self-attention layer and a multi-layer perceptron (MLP) interleaved with skip connections. A full transformer stacks T such blocks1.
Prior to the transformer blocks is the token embedding process, where tokenization maps an input string into a list of L integers drawn from a dictionary. These integers serve as row indices into a matrix We ∈ R^{v×d}, where each row is a learnable representation vector for that token, v is the total number of unique tokens in the vocabulary, and d is an arbitrarily chosen hidden dimension. The initial embedding is therefore X^{(0)} ∈ R^{L×d}.
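A minimal PyTorch sketch of this embedding lookup; the token IDs and dimensions are illustrative placeholders, not the paper's configuration:

```python
import torch
import torch.nn as nn

v, d = 30_000, 768          # vocabulary size and hidden dimension (illustrative values)
W_e = nn.Embedding(v, d)    # learnable matrix W_e in R^{v x d}

token_ids = torch.tensor([[101, 2023, 2003, 1037, 7953, 102]])  # L = 6 token indices
X0 = W_e(token_ids)         # initial embedding, shape (1, L, d)
print(X0.shape)             # torch.Size([1, 6, 768])
```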
Each block in the transformer then transforms this embedding: the i-th transformer block maps X^{(i−1)} = [x1^{(i−1)}, ..., xL^{(i−1)}]⊤ ∈ R^{L×d} to X^{(i)} = [x1^{(i)}, ..., xL^{(i)}]⊤ ∈ R^{L×d}1,31,32. X^{(T)} is the last hidden state of the network. The first part of this map is self-attention, which mixes information across the L token vectors, followed by the MLP, which mixes information across the d hidden dimensions31,33.
Including the MLP, the entire transformer block can be written as:

Z^{(i)} = X^{(i−1)} + SelfAttention(X^{(i−1)})
X^{(i)} = Z^{(i)} + σ(Z^{(i)} W1 + b1) W2 + b2

where b1 and b2 are the biases associated with the learned linear transformations W1 ∈ R^{d×I} and W2 ∈ R^{I×d}, with I > d. The activation function σ, e.g., ReLU or GeLU, introduces non-linearity1. More recently, the biases are often omitted, which improves training stability, throughput, and final performance. Additionally, improvements such as SwiGLU activation functions and rotary positional embeddings are also commonly used3,4,34,35.
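The block above can be sketched in PyTorch roughly as follows; normalization placement and other details vary across implementations, so this is illustrative rather than the paper's exact architecture:

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d=768, n_heads=12, expansion=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
        # MLP: W1 in R^{d x I}, W2 in R^{I x d} with I = expansion * d > d.
        self.mlp = nn.Sequential(
            nn.Linear(d, expansion * d),   # x W1 + b1
            nn.GELU(),                     # sigma
            nn.Linear(expansion * d, d),   # ... W2 + b2
        )

    def forward(self, x):
        # Self-attention mixes information across the L token positions (skip connection).
        x = x + self.attn(x, x, x, need_weights=False)[0]
        # The MLP mixes information across the d hidden dimensions (skip connection).
        x = x + self.mlp(x)
        return x

block = TransformerBlock()
out = block(torch.randn(1, 6, 768))   # shape preserved: (1, L, d)
```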
GPT (Generative Pretrained Transformer) models, such as OpenAI’s GPT series (GPT-3, GPT-4, etc.), are designed for generative tasks and use transformer decoders36,37,38. They employ causal (unidirectional) attention, meaning each token attends only to previous tokens in the sequence, enabling autoregressive generation during inference. This allows them to predict the next word in a sequence without direct access to future words.
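A short sketch of the causal mask that enforces this unidirectional attention (illustrative; real GPT implementations differ in detail, and mask polarity conventions vary by library):

```python
import torch

L = 6
# Causal mask: position i may attend only to positions j <= i.
causal_mask = torch.tril(torch.ones(L, L, dtype=torch.bool))
print(causal_mask.int())
# Row i has ones only up to column i, so each token never "sees" future tokens;
# such a mask is typically passed as attn_mask to the attention layer.
```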
In contrast, BERT models utilize transformer encoders with bidirectional attention, meaning they can attend to all tokens within an input simultaneously. This structure enables them to capture contextual dependencies from both directions, making them well-suited for tasks like text classification and sentence similarity39. Unlike GPT models, BERT is trained using a masked language modeling (MLM) objective, where some tokens are randomly hidden, requiring the model to predict them based on the surrounding context.
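The masked language modeling objective can be illustrated with a small sketch that randomly hides tokens. The 15% rate and [MASK] handling here follow the common BERT recipe, not necessarily the exact setup of any model cited above:

```python
import torch

def mask_tokens(token_ids, mask_token_id, mask_prob=0.15):
    """Randomly replace a fraction of tokens with [MASK]; the model must predict the originals."""
    labels = token_ids.clone()
    masked = torch.rand(token_ids.shape) < mask_prob
    labels[~masked] = -100              # ignore unmasked positions in the loss
    inputs = token_ids.clone()
    inputs[masked] = mask_token_id      # hide the selected tokens
    return inputs, labels

inputs, labels = mask_tokens(torch.tensor([[101, 2023, 2003, 1037, 7953, 102]]),
                             mask_token_id=103)
```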
Mixture of Experts
Mixture of