Co-citation based data augmentation for contrastive learning of scientific domains
Data compilation
We used co-citations as a similarity heuristic to generate sufficiently large training datasets for contrastive learning over scientific domains. Our strategy enabled the production of large training datasets from small amounts of data due to the nonlinear scaling of citation graphs: a single paper citing N other papers produces $\binom{N}{2} = N(N-1)/2$ co-citation pairs. For context, a dataset of 10,000 individual papers can produce well over 125,000 co-citation pairs. While this measure of similarity is not perfect, co-citation has generally been shown to imply a high degree of similarity between papers [21]. For our modeling purposes, we assume that two co-cited papers are more similar than two random papers, even if the random papers come from the same field.
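To make the scaling concrete, the pairs contributed by each citing paper can be enumerated with itertools.combinations. The sketch below is illustrative and not the paper's actual pipeline code:

```python
from itertools import combinations
from collections import Counter

def co_citation_counts(reference_lists):
    """Count how often each unordered pair of papers is co-cited.

    reference_lists: one list of cited-paper IDs per citing paper.
    A paper citing N others contributes C(N, 2) = N*(N-1)/2 pairs.
    """
    pair_counts = Counter()
    for refs in reference_lists:
        for pair in combinations(sorted(set(refs)), 2):
            pair_counts[pair] += 1
    return pair_counts

# One paper citing 5 others yields C(5, 2) = 10 co-citation pairs.
print(len(co_citation_counts([["p1", "p2", "p3", "p4", "p5"]])))   # 10
```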
To build our dataset, we randomly chose five biomedical subfields with little overlap: cardiovascular disease (CVD), chronic obstructive pulmonary disease (COPD), parasitic diseases, autoimmune diseases, and skin cancers. PubMed Central was queried with Medical Subject Heading (MeSH) terms for each domain, requiring each paper to have an abstract and at least one citation and to be published between 2010 and 2022. In other words, of the possible $\binom{N}{2}$ co-citations per paper within this time window, we kept only the pairs returned by the same MeSH query. When constructing the final dataset, we sampled preferentially from pairs that were co-cited more often.
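The preferential sampling step can be sketched as weighted sampling over the pair counts from the previous sketch; the helper below and its target size k are illustrative assumptions, not the paper's code:

```python
import random

def sample_pairs(pair_counts, k, seed=0):
    """Draw k co-citation pairs, weighting pairs that were co-cited more often more heavily."""
    rng = random.Random(seed)
    pairs = list(pair_counts)
    weights = [pair_counts[p] for p in pairs]
    # Sampling is with replacement; duplicates can be dropped afterwards if needed.
    return rng.choices(pairs, weights=weights, k=k)
```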
For evaluation, we constructed “negative” examples of abstract pairs that were not co-cited. The training dataset was split randomly in a 99:1 ratio and then deduplicated. We built negative pairs by pairing abstracts that had not been co-cited and had each been cited at least 15 times. These criteria allowed us to construct a representative evaluation set for binary classification with balanced classes, labeling a pair 1 if co-cited and 0 otherwise. The exact dataset counts are outlined in Table 1.
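A sketch of the negative-pair construction follows, using the 15-citation threshold from the text and otherwise hypothetical inputs (a collection of co-cited pairs, a paper list, and per-paper citation counts):

```python
import random

def build_eval_set(cocited_pairs, papers, citation_counts, seed=0):
    """Balanced binary evaluation set: label 1 for co-cited pairs, 0 for non-co-cited negatives."""
    rng = random.Random(seed)
    cocited = {tuple(sorted(p)) for p in cocited_pairs}
    eligible = [p for p in papers if citation_counts[p] >= 15]  # both papers cited >= 15 times
    negatives = set()
    while len(negatives) < len(cocited):                        # match the number of positives
        pair = tuple(sorted(rng.sample(eligible, 2)))
        if pair not in cocited:
            negatives.add(pair)
    return [(p, 1) for p in cocited] + [(p, 0) for p in sorted(negatives)]
```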
Transformer neural networks
The transformer architecture is adept at sequential processing and is state-of-the-art for various natural language processing (NLP) and vision tasks [24,25,26,27,28,29,30]. A transformer block comprises a self-attention layer and a multi-layer perceptron (MLP) interleaved with skip connections. A full transformer is made of T transformer blocks stacked together [1].
Prior to the transformer blocks is the token-embedding step, where tokenization maps an input string into a list of L integers from a dictionary. These integers serve as row indices into a matrix $W_e \in \mathbb{R}^{v \times d}$, where each row is a learnable representative vector for that token, v is the total number of unique tokens in the vocabulary, and d is an arbitrarily chosen hidden dimension. The initial embedding therefore lies in $\mathbb{R}^{L \times d}$.
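As a small illustration of this lookup (the vocabulary size, hidden dimension, and token IDs below are made up):

```python
import numpy as np

v, d = 30_000, 256                      # vocabulary size and hidden dimension (illustrative)
W_e = np.random.randn(v, d) * 0.02      # learnable token-embedding matrix, W_e in R^{v x d}

token_ids = np.array([101, 7592, 2088, 102])   # tokenizer output: a list of L integer indices
X0 = W_e[token_ids]                            # initial embedding in R^{L x d}
print(X0.shape)                                # (4, 256)
```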
Each block in the transformer then transforms this embedding, i.e., the $i$-th transformer block maps the embedding $X^{(i-1)} = [x_1^{(i-1)}, \ldots, x_L^{(i-1)}]^\top \in \mathbb{R}^{L \times d}$ to $X^{(i)} = [x_1^{(i)}, \ldots, x_L^{(i)}]^\top \in \mathbb{R}^{L \times d}$ [1,31,32]. $X^{(T)}$ is the last hidden state of the network. The first part of this map is self-attention, which mixes information across the L token vectors, followed by the MLP, which mixes information across the d hidden dimensions [31,33].
Including the MLP, the entire transformer block can be written as:

$$\tilde{X} = X^{(i-1)} + \mathrm{SelfAttention}\!\left(X^{(i-1)}\right), \qquad X^{(i)} = \tilde{X} + \sigma\!\left(\tilde{X} W_1 + b_1\right) W_2 + b_2,$$

where $b_1$ and $b_2$ are biases associated with the learned linear transformations $W_1 \in \mathbb{R}^{d \times I}$ and $W_2 \in \mathbb{R}^{I \times d}$, with $I > d$. The activation function $\sigma$, e.g., ReLU or GeLU, introduces non-linearity [1]. More recently, biases are often omitted, which improves training stability, throughput, and final performance. Additionally, improvements such as SwiGLU activation functions and rotary positional embeddings are commonly used [3,4,34,35].
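A minimal numpy sketch of such a block follows, under the simplifying assumptions of a single attention head, ReLU as $\sigma$, and no layer normalization; the function and variable names are illustrative, not the paper's code:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def transformer_block(X, Wq, Wk, Wv, Wo, W1, b1, W2, b2):
    """One block: single-head self-attention plus MLP, each wrapped in a skip connection.

    X: (L, d) token embeddings. Layer normalization is omitted to keep the sketch short.
    """
    d = X.shape[1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # (L, d) queries, keys, values
    A = softmax(Q @ K.T / np.sqrt(d))           # (L, L) attention weights mix across tokens
    X = X + (A @ V) @ Wo                        # skip connection around self-attention
    H = np.maximum(0.0, X @ W1 + b1)            # sigma = ReLU; W1 in R^{d x I}
    return X + H @ W2 + b2                      # W2 in R^{I x d}; skip connection around MLP

# Illustrative shapes: L tokens, hidden size d, MLP width I > d.
L, d, I = 4, 256, 1024
rng = np.random.default_rng(0)
Wq, Wk, Wv, Wo = (rng.normal(0, 0.02, (d, d)) for _ in range(4))
W1, b1 = rng.normal(0, 0.02, (d, I)), np.zeros(I)
W2, b2 = rng.normal(0, 0.02, (I, d)), np.zeros(d)
X = rng.normal(size=(L, d))
print(transformer_block(X, Wq, Wk, Wv, Wo, W1, b1, W2, b2).shape)   # (4, 256)
```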
GPT (Generative Pretrained Transformer) models, such as OpenAI’s GPT series (GPT-3, GPT-4, etc.), are designed for generative tasks and use transformer decoders [36,37,38]. They employ causal (unidirectional) attention, meaning each token attends only to previous tokens in the sequence, enabling autoregressive generation during inference. This allows them to predict the next word in a sequence without direct access to future words.
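Causal attention can be sketched as a mask applied before the softmax, so that each row of attention weights covers only positions up to and including the current token (the sequence length and scores below are illustrative):

```python
import numpy as np

L = 5
scores = np.random.randn(L, L)                      # raw attention scores (illustrative)
future = np.triu(np.ones((L, L), dtype=bool), k=1)  # True above the diagonal = future tokens
scores = np.where(future, -np.inf, scores)          # block attention to future positions
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)      # row i is nonzero only for positions <= i
```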
In contrast, BERT models utilize transformer encoders with bidirectional attention, meaning they can attend to all tokens within an input simultaneously. This structure enables them to capture richer contextual dependencies, making them well suited for tasks like text classification and sentence similarity [39]. Unlike GPT models, BERT is trained with a masked language modeling (MLM) objective, in which some tokens are randomly hidden and the model must predict them from the surrounding context.
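A minimal sketch of the masking step for MLM, assuming a 15% masking rate and a single [MASK] token ID (real BERT preprocessing additionally keeps or randomizes a fraction of the selected tokens):

```python
import numpy as np

def mask_tokens(token_ids, mask_id, rate=0.15, seed=0):
    """Randomly replace tokens with mask_id; labels keep the originals only at masked positions."""
    rng = np.random.default_rng(seed)
    token_ids = np.asarray(token_ids)
    selected = rng.random(token_ids.shape) < rate    # choose ~15% of positions to mask
    masked_input = np.where(selected, mask_id, token_ids)
    labels = np.where(selected, token_ids, -100)     # -100 marks positions ignored by the loss
    return masked_input, labels
```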
Mixture of Experts
Mixture of