Stacking Sats Smarter: Enhancing DCA with Polymarket Signals
--
Anton Pash — Georgia Tech OMSA Bitcoin Analytics Practicum
A study on how prediction market signals from Polymarket can improve traditional Bitcoin dollar-cost averaging strategies.
Introduction
This article analyzes Bitcoin trading data alongside Polymarket prediction market data to extract insights for improving Bitcoin accumulation strategies. The goal is to enhance a baseline Dollar-Cost Averaging (DCA) approach by incorporating trading signals derived from Polymarket as a proxy for broader market sentiment, while preserving the simplicity and consistency of systematic investing.
Bitcoin, as a non-asset-backed digital currency with a fixed supply of 21 million coins, is highly sensitive to market perception and collective trust. This makes sentiment-driven signals potentially valuable for optimizing entry timing beyond purely mechanical investment strategies.
With the emergence of prediction markets such as Polymarket, it is now possible to better quantify crowd-based expectations on Bitcoin-related outcomes. This creates an opportunity to integrate behavioral signals into traditional crypto accumulation strategies.
This research explores whether such sentiment signals can meaningfully improve DCA performance. The following sections outline the data sources, exploratory analysis, methodology, model evaluation, and key findings.
Data Sources
For this analysis, we use two datasets provided by the project team. The first is the BTC coin metrics dataset, which contains Bitcoin price data and related financial indicators over the period from January 2009 to January 2026.
The second dataset consists of Polymarket trade data, compiled from multiple sources and spanning January 2023 to January 2026. This dataset is used to capture prediction market activity that may reflect broader sentiment around Bitcoin-related outcomes.
Methodology Design
To identify an effective approach for combining Bitcoin financial data with Polymarket market signals, the analysis follows a structured multi-stage process:
- We first conduct exploratory data analysis (EDA) on both the Bitcoin dataset and the Polymarket dataset to understand their structure, distributions, and potential relationships relevant to Bitcoin price dynamics and sentiment signals.
- Based on the findings from the EDA stage, we identify and select key variables that may be informative in predicting optimal Bitcoin purchase timing.
- We then evaluate a range of modelling approaches using these selected variables in order to determine the most effective framework for integrating Polymarket-derived signals into a Bitcoin accumulation strategy.
- Finally, the selected model is implemented and tested, and the resulting performance is analyzed to assess the effectiveness of the proposed approach.
Exploratory Data Analysis
Bitcoin Data
Figure 1 shows the historical Bitcoin price series, highlighting a strong upward trend with significant volatility, particularly after 2021. During this period, prices range from below $20,000 to above $120,000, indicating increasing market sensitivity and larger-amplitude price swings over time.
This increase in volatility is important as it suggests that more recent data may contain stronger signals for short-term dynamics, but also higher noise.
Figure 2 presents a decomposition of the Bitcoin price series into trend, seasonal, and residual components. The trend component captures the long-term upward movement, while the seasonal component shows repeating short-term patterns.
However, the residual component still exhibits visible structure, indicating that the decomposition does not capture all of the underlying information. This is supported by the ACF and PACF plots presented below: the ACF decays slowly while the PACF shows a pronounced spike at lag one, a pattern consistent with a highly persistent AR(1) process, that is, one close to a unit root.
To address this, first-order differencing is applied, which improves the stationarity properties of the series. Following this transformation, no meaningful autoregressive or moving average structure remains, and the process closely resembles a random walk. Consequently, the residual dynamics are best represented by an ARIMA(0,1,0) specification.
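For reference, the ARIMA(0,1,0) specification can be written out explicitly. The notation is standard textbook notation rather than anything taken from the project code: $y_t$ is the series being modelled, $\varepsilon_t$ is white noise with variance $\sigma^2$, and $B$ is the backshift operator. No drift term is assumed here.

```latex
\begin{aligned}
(1 - B)\,y_t &= \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2) \\
y_t &= y_{t-1} + \varepsilon_t \\
\hat{y}_{t+h \mid t} &= y_t \quad \text{for all } h \ge 1
\end{aligned}
```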
Fitting an ARIMA(0,1,0) model to the decomposed residuals and generating a 30-day forecast yields moderately strong initial results, with a mean absolute error of approximately $1,087, corresponding to an error rate of roughly 1% relative to the Bitcoin price level.
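To make the error metric concrete, the following is a minimal sketch of how a mean absolute error and a relative error could be computed for one-step-ahead random-walk forecasts. It is not the project's code: the function names are hypothetical, the prices are invented for illustration, and it assumes an ARIMA(0,1,0) model without drift, so that each forecast equals the previous observation.

```python
def naive_one_step_forecasts(series):
    """One-step-ahead random-walk forecasts: the forecast for day t is y[t-1].

    Returns forecasts aligned with series[1:].
    """
    return list(series[:-1])


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented prices for illustration only (not the study's data).
prices = [100_000.0, 101_000.0, 99_500.0, 100_200.0, 102_000.0]

actuals = prices[1:]
forecasts = naive_one_step_forecasts(prices)
mae = mean_absolute_error(actuals, forecasts)

# Relative error: MAE divided by the average price level over the test days.
relative_error = mae / (sum(actuals) / len(actuals))
```

A small relative error of this kind mostly reflects how little the price moves from one day to the next compared with its level, so it says nothing by itself about directional skill.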
However, a visual inspection of the forecast in Figure 4 below indicates that the model behaves primarily as a lagging indicator, closely tracking recent price movements rather than providing meaningful predictive power. This is expected from a random-walk specification, which carries the latest observation forward and has no richer autoregressive or moving-average structure to exploit.
The correlation matrix in Figure 5 below reveals several notable patterns. First, Market Cap and Price (USD) are almost perfectly correlated, which is expected given that market capitalization is mechanically derived from price and circulating supply. As a result, this relationship provides limited additional predictive value, since it introduces no independent information beyond the underlying price dynamics.
In contrast, Hash Rate and Transaction Count show strong positive correlations with Bitcoin price variables, approximately 0.93 and 0.66 respectively. These relationships suggest that both computational power and network activity may be useful features for an accumulation strategy.
However, the similarity of these correlation values across different lag structures raises concerns about whether these variables provide truly incremental information, or whether they simply reflect contemporaneous market conditions rather than predictive signals.
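One way to probe this concern is to compare correlations at several lags, both on levels and on day-to-day changes. The sketch below is illustrative only: the helper names are hypothetical, the numbers are invented, and it uses plain Python rather than whatever tooling the project used. Two series that both trend upward tend to correlate strongly in levels at every lag, so correlating their changes is a common sanity check.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(feature, target, lag):
    """Correlate feature[t - lag] with target[t]; lag=0 is contemporaneous."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(feature, target)
    return pearson(feature[:-lag], target[lag:])


def daily_changes(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]


# Invented, upward-trending values for illustration only (not project data).
hash_rate = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0, 25.0, 24.0]
price = [100.0, 110.0, 130.0, 125.0, 150.0, 170.0, 200.0, 190.0]

level_corr = {lag: lagged_correlation(hash_rate, price, lag) for lag in (0, 1, 2)}
change_corr = {
    lag: lagged_correlation(daily_changes(hash_rate), daily_changes(price), lag)
    for lag in (0, 1, 2)
}
```

If the level correlations stay high across lags while the change correlations fall away at positive lags, the feature is more likely describing current conditions than leading the price.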
Polymarket Data
The Polymarket dataset is filtered to include only Bitcoin-related markets by selecting questions containing the keyword “bitcoin” (case-insensitive). To avoid duplication across markets, only the “Yes” side of each contract is retained. It is important to note that Bitcoin-specific Polymarket data is only available from May 2023 onwards, compared to the full dataset starting in January 2023. This slightly reduces the available historical window and limits the ability to extract long-term insights from the data.
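The filtering step described above can be sketched as follows. The record layout is an assumption: the field names `question` and `outcome` are hypothetical stand-ins for whatever columns the Polymarket dataset actually uses, and the example questions are invented.

```python
def filter_bitcoin_yes_trades(trades, keyword="bitcoin"):
    """Keep trades whose question mentions the keyword (case-insensitive)
    and whose outcome is the "Yes" side of the contract."""
    needle = keyword.lower()
    return [
        trade
        for trade in trades
        if needle in trade["question"].lower()
        and trade["outcome"].strip().lower() == "yes"
    ]


# Invented records for illustration only; field names are hypothetical.
trades = [
    {"question": "Will Bitcoin reach $100k in 2024?", "outcome": "Yes", "price": 0.35},
    {"question": "Will Bitcoin reach $100k in 2024?", "outcome": "No", "price": 0.65},
    {"question": "Will BITCOIN ETFs be approved?", "outcome": "yes", "price": 0.80},
    {"question": "Who wins the 2024 election?", "outcome": "Yes", "price": 0.50},
]

bitcoin_yes = filter_bitcoin_yes_trades(trades)
```

Keeping only one side avoids counting the same market twice, since in a two-outcome market the "No" price is roughly the complement of the "Yes" price.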
Examining trading volume for Bitcoin-related Polymarket markets in Figure 6, the data exhibits periods of relatively stable activity interspersed with sharp spikes, most notably around January 2025. These spikes likely correspond to periods of heightened interest or major Bitcoin-related events. While trading volume does not provide directional information regarding market sentiment, it can serve as a proxy for shifts in public attention and engagement.
Further analysis focuses on sentiment derived from Polymarket trade prices. As shown in Figure 7, the aggregated daily sentiment exhibits frequent short-term fluctuations, with occasional periods of increased volatility. However, the overall behavior appears relatively stationary, suggesting limited persistence in directional sentiment over time.
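The article does not spell out how trades are aggregated into daily series, so the following is only one plausible sketch. It assumes daily volume is the sum of trade sizes and daily sentiment is the volume-weighted average "Yes" price. The field names `timestamp`, `price`, and `size` are hypothetical, and the records are invented.

```python
from collections import defaultdict
from datetime import date, datetime


def daily_volume_and_sentiment(trades):
    """Aggregate trades into per-day volume and volume-weighted average price."""
    volume = defaultdict(float)
    weighted_price = defaultdict(float)
    for trade in trades:
        day = datetime.fromisoformat(trade["timestamp"]).date()
        volume[day] += trade["size"]
        weighted_price[day] += trade["price"] * trade["size"]
    return {
        day: {"volume": volume[day], "sentiment": weighted_price[day] / volume[day]}
        for day in sorted(volume)
        if volume[day] > 0  # skip days with no traded size
    }


# Invented "Yes"-side trades for illustration only.
trades = [
    {"timestamp": "2025-01-20T09:15:00", "price": 0.60, "size": 100.0},
    {"timestamp": "2025-01-20T17:40:00", "price": 0.70, "size": 300.0},
    {"timestamp": "2025-01-21T11:05:00", "price": 0.50, "size": 200.0},
]

daily = daily_volume_and_sentiment(trades)
first_day = daily[date(2025, 1, 20)]
```

Weighting by size lets large trades count for more than small ones. An unweighted mean or a last-trade price would be equally defensible choices.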
This indicates that while Polymarket sentiment may capture short-term changes in market perception, it is unlikely to provide strong predictive power in isolation. Therefore, in the proposed Bitcoin accumulation strategy, both trading volume and sentiment measures are incorporated as complementary signals alongside traditional financial indicators.
Modelling
Full Feature Model
The full feature model is designed to predict the directional movement of Bitcoin prices on the following day (i.e., whether the price will increase or decrease). This prediction is intended to enhance a baseline DCA strategy by identifying more favorable entry points when the price falls below its moving average.
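The two ingredients described here, a next-day direction label and a below-moving-average condition, can be sketched as below. The window length is not stated in the article, so the value of 3 is purely illustrative, and the choice of a trailing mean that includes the current day is also an assumption. Function names are hypothetical and the prices are invented.

```python
def next_day_up_labels(prices):
    """Label each day 1 if the next day's price is higher, else 0.

    The final day has no next-day price, so the result is one shorter.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(prices, prices[1:])]


def below_moving_average(prices, window):
    """Flag days where the price is below its trailing moving average.

    The average covers the current day and the window - 1 days before it.
    Days without enough history are marked None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    flags = []
    for i, price in enumerate(prices):
        if i + 1 < window:
            flags.append(None)
        else:
            average = sum(prices[i + 1 - window : i + 1]) / window
            flags.append(price < average)
    return flags


# Invented prices for illustration only.
prices = [100.0, 102.0, 101.0, 99.0, 103.0]

labels = next_day_up_labels(prices)
buy_condition = below_moving_average(prices, window=3)
```

Using only trailing data for the moving average keeps the buy condition free of look-ahead, which matters when the label itself refers to the following day.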
Several modelling approaches were evaluated, including Random Forest and Logistic Regression. However, neither model produced satisfactory predictive performance when applied independently. To address this, an ensemble approach combining both methods was implemented in an attempt to leverage their complementary strengths.
Despite incorporating a broad set of features derived from the EDA, the resulting model achieved a predictive accuracy of approximately 48.5%, which is effectively equivalent to random guessing. Additionally, the predicted probabilities exhibited limited dispersion, with many outputs clustering near zero regardless of the actual outcome.
Further analysis indicates that adjusting the classification threshold is unlikely to materially improve performance, as the model fails to generate sufficiently informative probability estimates. Overall, these results suggest that the full feature model does not provide meaningful predictive power for short-term Bitcoin price direction.
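The threshold argument can be illustrated with a small sweep. This is a sketch with invented probabilities and labels, not the model's actual outputs, and the function name is hypothetical. It assumes a prediction of "up" whenever the predicted probability is at or above the threshold.

```python
def accuracy_at_threshold(probabilities, labels, threshold):
    """Share of correct calls when predicting 'up' (1) for p >= threshold."""
    if len(probabilities) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for pred, y in zip(predictions, labels) if pred == y)
    return correct / len(labels)


# Invented probabilities clustered near zero, with balanced invented labels.
probabilities = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03]
labels = [1, 0, 1, 0, 1, 0]

sweep = {
    threshold: accuracy_at_threshold(probabilities, labels, threshold)
    for threshold in (0.1, 0.3, 0.5, 0.7)
}
```

When every probability sits below all candidate thresholds, each threshold produces the same all-"down" prediction, so accuracy simply mirrors the class balance and moving the threshold changes nothing.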
Enhanced DCA Model
Given that the full feature ensemble model did not produce sufficient predictive performance, the approach shifts from price prediction to strategy enhancement. Specifically, the baseline Dollar-Cost Averaging (DCA) model is extended by incorporating Polymarket-derived signals.
Among the features explored, Polymarket trading volume showed the most promise, particularly in capturing spikes in market attention surrounding Bitcoin-related events. Rather than attempting to predict price direction, this signal is used to dynamically adjust the aggressiveness of purchases. Once the baseline DCA framework identifies a buying condition, Polymarket volume is used to scale the allocation size, with higher volume indicating increased market interest.
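A minimal sketch of volume-scaled allocation follows. The article does not give the exact scaling rule, so everything specific here is an assumption: the multiplier is today's Polymarket volume divided by a trailing median, it is clipped to a hypothetical range of 0.5 to 2.0, higher volume is taken to mean a larger purchase, and days without a buying condition receive the unscaled base amount. All names are hypothetical.

```python
import statistics


def volume_multiplier(volume_today, trailing_volumes, floor=0.5, cap=2.0):
    """Ratio of today's volume to the trailing median, clipped to [floor, cap].

    Falls back to 1.0 (no scaling) when there is no usable history.
    """
    if not trailing_volumes:
        return 1.0
    baseline = statistics.median(trailing_volumes)
    if baseline <= 0:
        return 1.0
    return min(cap, max(floor, volume_today / baseline))


def enhanced_dca_amount(base_amount, buy_signal, volume_today, trailing_volumes):
    """Scale the purchase by the volume multiplier only when a buy signal fires."""
    if not buy_signal:
        return base_amount
    return base_amount * volume_multiplier(volume_today, trailing_volumes)


# Invented values for illustration only.
quiet_history = [100.0, 100.0, 100.0]

spike_day = enhanced_dca_amount(100.0, True, 150.0, quiet_history)
no_signal_day = enhanced_dca_amount(100.0, False, 150.0, quiet_history)
capped_day = enhanced_dca_amount(100.0, True, 1_000.0, quiet_history)
```

Clipping keeps a single volume spike from dominating the schedule. In a backtest, the scaled amounts would also need to be normalised to the same total budget as the baseline for the comparison to be fair.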
The resulting strategy achieves a win rate of approximately 56% over the test period from May 2023 to December 2025. While the directional prediction accuracy remains relatively low at 47%, the model still outperforms random allocation by approximately 6%. This highlights an important distinction between prediction accuracy and strategy performance.
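The article does not define the win rate precisely. One plausible reading, sketched below as an assumption rather than the study's actual metric, is the share of evaluation windows in which the enhanced strategy accumulates more satoshis per dollar spent than the baseline. Function names are hypothetical and all numbers are invented.

```python
SATS_PER_BTC = 100_000_000


def sats_per_dollar(prices, amounts):
    """Satoshis accumulated per dollar spent, given daily prices and spend."""
    if len(prices) != len(amounts):
        raise ValueError("prices and amounts must have equal length")
    spent = sum(amounts)
    if spent <= 0:
        raise ValueError("total spend must be positive")
    btc = sum(amount / price for amount, price in zip(amounts, prices))
    return btc * SATS_PER_BTC / spent


def win_rate(enhanced_scores, baseline_scores):
    """Fraction of windows where the enhanced score beats the baseline score."""
    if len(enhanced_scores) != len(baseline_scores) or not enhanced_scores:
        raise ValueError("inputs must be non-empty and of equal length")
    wins = sum(1 for e, b in zip(enhanced_scores, baseline_scores) if e > b)
    return wins / len(enhanced_scores)


# Invented two-day window: the enhanced schedule shifts spend to the cheaper day.
prices = [50_000.0, 40_000.0]
baseline = sats_per_dollar(prices, [100.0, 100.0])
enhanced = sats_per_dollar(prices, [50.0, 150.0])

# Invented per-window scores for illustration only.
example_win_rate = win_rate([2375.0, 2000.0], [2250.0, 2100.0])
```

Under a metric like this, a strategy can win more than half of its windows without predicting daily direction well, because what matters is where the money is concentrated and not how often the next-day call is right.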
In terms of cumulative returns, the enhanced strategy demonstrates strong outperformance relative to the baseline benchmark, with consistent growth beginning in early 2024, as illustrated in the figure below.
Findings & Conclusion
This report explored methods to improve a baseline Dollar-Cost Averaging (DCA) strategy for Bitcoin allocation by incorporating additional financial and market-based signals, including Polymarket data as a proxy for market sentiment. Through exploratory data analysis (EDA), several candidate variables were identified, including both Bitcoin financial metrics and Polymarket-derived features.
However, the analysis highlights a key challenge in financial modeling: short-term price movements are inherently difficult to predict. The Bitcoin time series exhibits behavior consistent with a random walk, similar to other financial assets. Multiple modelling approaches were tested, including Random Forest, Logistic Regression, and an ensemble of both. Despite incorporating a broad set of features, the resulting models failed to achieve meaningful predictive power, with accuracy levels at or below that of random guessing.
Given these results, the approach shifted from prediction to strategy enhancement. The baseline DCA model was retained for its robustness and simplicity, while Polymarket trading volume was introduced as an additional signal. Among the features considered, trading volume offered the best signal-to-noise ratio and served as a proxy for market attention.
This signal was used to dynamically adjust the aggressiveness of capital allocation within the DCA framework. Rather than predicting price direction, the model increases or decreases purchase size based on observed market activity, while maintaining the core structure of systematic accumulation.
The enhanced strategy achieved a 56% win rate and generated consistent cumulative outperformance relative to the baseline DCA model over the test period. Notably, this improvement was achieved despite low predictive accuracy, reinforcing the distinction between prediction and strategy performance.
Overall, the results demonstrate that incorporating market-derived signals into a simple, rule-based framework can provide a meaningful advantage without relying on complex or opaque models. In practice, such approaches may be preferable in financial applications, where robustness, interpretability, and consistency often outweigh marginal gains from highly complex “black-box” models.