Cryptocurrency Price Forecasting using XGBoost Regressor and Technical Indicators

Posted by Kaley Read
Comments: 0 | Views: 27 | Posted: 25-01-21 07:05


The rapid growth of the stock market has attracted many investors because of its potential for significant profits. However, predicting stock prices accurately is difficult because financial markets are complex and constantly changing. This is especially true for the cryptocurrency market, which is known for its extreme volatility, making it difficult for traders and investors to make smart and profitable decisions. This study introduces a machine learning approach to predict cryptocurrency prices. Specifically, we use important technical indicators such as the Exponential Moving Average (EMA) and Moving Average Convergence Divergence (MACD) as input features to train the XGBoost regressor model. We demonstrate our approach through an analysis focusing on the closing prices of Bitcoin. We evaluate the model’s performance through various simulations, showing promising results that suggest its usefulness in guiding cryptocurrency traders and investors in dynamic market conditions.

Index Terms:

I Introduction

Over the past few years, the rapid expansion of the stock market has made it an attractive option for investors seeking high returns and easy access. However, investing in stocks carries inherent risks, underscoring the need for a well-defined investment strategy. Traditionally, investors relied on empirical methods such as technical analysis, guided by financial expertise. With the widespread adoption of financial technology (FinTech), statistical models incorporating machine learning techniques have emerged for forecasting stock price movements. This shift has demonstrated significant success across various markets, including the S&P 500, NASDAQ [1], and the cryptocurrency market [2, 3]. In this research, our emphasis is on the cryptocurrency market, a dynamic force in finance, with a specific focus on Bitcoin price prediction [4].

Furthermore, blockchain technology, the backbone of cryptocurrencies, has gained substantial attention in the banking and financial industry due to its secure and transparent decentralized database [5]. Despite the advantages of ample market data and continuous trading, the cryptocurrency market faces challenges such as high price volatility and relatively small capitalization. Success in cryptocurrency trading hinges on the careful analysis and selection of data, making the development of machine learning models crucial for extracting meaningful insights. Models such as Long Short-Term Memory (LSTM) and Random Forest (RF) are instrumental in predicting cryptocurrency prices by leveraging historical data and patterns, thereby aiding effective decision-making in this volatile market. Despite this potential, few studies have attempted to create profitable trading strategies in the cryptocurrency market.

With the advent of FinTech, machine learning models have been increasingly adopted to forecast stock price movements, transforming the landscape of financial analysis and trading. These models leverage large datasets and complex algorithms to identify patterns and predict future price trends, which has led to notable success across various markets, including the S&P 500 and NASDAQ [1]. In the cryptocurrency market, which is characterized by high volatility and rapid price fluctuations, machine learning techniques have proven particularly valuable. Studies have demonstrated the efficacy of deep learning methods, such as Stacked Denoising Autoencoders (SDAE) and LSTM networks, in predicting Bitcoin prices with high accuracy [2, 3]. These models use a variety of inputs, including historical price data, trading volume, public sentiment, and macroeconomic indicators, to generate predictions that can guide investment decisions. The integration of machine learning into FinTech has thus provided investors with powerful tools to navigate the complexities of financial markets, enhancing their ability to make informed and strategic trading decisions.

Despite the advantages of the cryptocurrency market, such as ample market data and continuous trading, it faces significant challenges like high price volatility and relatively small capitalization. Successful trading in this market depends on careful data analysis and selection, making the development of machine learning models crucial for extracting meaningful insights. Models like LSTM and RF are instrumental in predicting cryptocurrency prices by using historical data and patterns, thus aiding effective decision-making in this volatile landscape. While there have been limited studies on creating successful trading strategies in the cryptocurrency market, our research aims to bridge this gap by introducing a novel machine learning approach using the XGBoost regressor model, which incorporates essential technical indicators and historical data to enhance financial trading strategies.

This research introduces an efficient machine learning approach for forecasting cryptocurrency prices, specifically focusing on Bitcoin. The motivation behind this research stems from the inherent volatility and complexity of the cryptocurrency market, which pose significant challenges for traders and investors. Traditional technical analysis and empirical methods are often insufficient for predicting price movements in such a dynamic environment. To address this, we propose using the XGBoost regressor model, a powerful machine learning technique known for its robustness and accuracy. Our methodology integrates a comprehensive set of technical indicators, including the Exponential Moving Average (EMA), Moving Average Convergence Divergence (MACD), Relative Strength Index (RSI), and other relevant metrics derived from historical market data. The data is sourced from Binance through its API, covering an extensive time span at high-frequency intervals, which allows for capturing rapid market changes.

The proposed model undergoes extensive preprocessing and feature engineering to enhance its predictive capabilities. By employing regularization techniques, we mitigate the risk of overfitting, and we fine-tune the model parameters through a grid search for optimal performance. Our results demonstrate that the XGBoost regressor model significantly improves prediction accuracy, evidenced by low Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) values, as well as a near-perfect R-squared value. This study contributes to the state of the art by providing a robust and scalable solution for cryptocurrency price prediction, leveraging advanced machine learning techniques to navigate the complexities of financial markets and aiding informed decision-making for traders and investors.

The key contributions of this paper are summarized as follows:

- Introduce an efficient machine learning approach using the XGBoost regressor model for cryptocurrency price prediction.

- Integrate a comprehensive set of technical indicators, including EMA, MACD, RSI, and others, with historical market data.

- Employ regularization techniques to mitigate overfitting and fine-tune model parameters through grid search.

- Demonstrate significant improvements in prediction accuracy with low MAE, RMSE, and a near-perfect R-squared value.

- Provide a robust and scalable solution for navigating the complexities of financial markets, aiding informed decision-making for traders and investors.

The rest of the paper is organized as follows. Section II reviews the most relevant existing works. Section III explains how we collected and prepared the data. Section IV proposes the machine learning model and its mathematical formulation. In Section V, we evaluate our proposed model and compare it with existing studies in the literature. Finally, Section VI concludes the paper.

II Related Work

The effort to forecast cryptocurrency prices has garnered significant interest in recent years, leading to the development of various methods to address this complex problem [6]. This section reviews advanced studies that employ machine learning for predicting cryptocurrency prices, with a particular focus on Bitcoin because of its dominant position and the extensive availability of data.

Among these advancements, machine learning has significantly impacted cryptocurrency price forecasting by providing models that adeptly navigate the complex and volatile digital currency market [7]. These methods range from simple regression models to advanced deep learning networks, each capable of detecting patterns and predicting future prices based on historical data [8].

Cryptocurrency price fluctuations are influenced by numerous factors, which has prompted the adoption of machine learning for price prediction [9, 10]. For example, Greaves and Au [11] investigated using network attributes and machine learning to predict Bitcoin prices. Similarly, Jang and Lee [12] combined blockchain-related features, time series analysis, and Bayesian neural networks (BNNs) for Bitcoin price analysis.

Building on this foundation, further research by [13], [14], and [15] has applied machine learning to Bitcoin price forecasting. Saad et al. [15] not only predicted prices but also identified significant network attributes and user behaviors influencing price variations in Bitcoin and Ethereum [16], alongside the supply and demand dynamics of cryptocurrencies. Additionally, Sin and Wang [17] used neural networks for price predictions, leveraging blockchain data features.

Continuing this trend, Christoforou et al. [18] developed a Bitcoin price prediction model using neural networks, focusing on factors affecting price volatility and using blockchain data and network activity metrics for forecasting. Furthermore, Chen et al. [19] and Akyildirim et al. [20] demonstrated the application of machine learning in forecasting Bitcoin prices and the mid-price movement of Bitcoin futures, respectively. These studies highlight the ability of machine learning to harness vast datasets and identify complex patterns, enhancing predictive accuracy beyond traditional statistical approaches.

Moreover, some studies have demonstrated the effectiveness of combining machine learning techniques with blockchain data for cryptocurrency price forecasting. For instance, Martin et al. [21] introduced a hybrid method that merges diverse data and analytical methods, enhancing accuracy in this complex field. Liu et al. [22] focused on optimizing performance and interpretability in financial time series, showcasing the advantages of combining various machine learning approaches. He et al. [23] developed a deep learning ensemble model for financial time series forecasting, applicable to cryptocurrencies, illustrating the increased reliability and accuracy of combining multiple deep learning [24] strategies.

Additionally, Nazareth and Reddy [8] reviewed machine learning in finance, highlighting the effectiveness of hybrid models in handling financial market complexities. Further research by Nagula and Alexakis [25], Petrovic et al. [26], Gupta and Nalavade [27], and Luo et al. [28] underscores the success of diverse computational methods in improving Bitcoin price predictions, advancing refined, accurate models for cryptocurrency investments.

In conclusion, machine learning excels not only in predictive accuracy but also in adaptability and scalability, both of which are essential as the cryptocurrency market evolves. With the capacity to update models with new data, machine learning remains an important tool for cryptocurrency trading and investment, supporting timely and precise forecasts [19, 20].

Unlike existing studies, our work introduces a novel machine learning approach that leverages the XGBoost regressor model, combining a variety of technical indicators such as EMA and MACD with historical data for Bitcoin price prediction. This approach emphasizes the use of regularization techniques to prevent overfitting and the fine-tuning of model parameters for enhanced accuracy. Our methodology stands out by effectively integrating diverse datasets and analytical techniques, supporting robust and precise predictions in the highly volatile cryptocurrency market.

III Data

This section covers the essential notations and abbreviations, explains the data collection process, details the preprocessing steps, and discusses the engineering of additional features. Table I provides the definitions of the parameters and abbreviations used in this paper.

III-A Data Collection and Preprocessing

We obtained Bitcoin historical market data from Binance through the Binance API [29]. The dataset spans from February 1, 2021, to February 1, 2022, with a time interval of 15 minutes ($\Delta t = 15$ minutes). This interval was chosen for its balance between capturing detailed market fluctuations and maintaining accuracy. The data is split into 80% for the training set and 20% for the testing set. The choice of a short time interval is particularly important because of the high volatility of the Bitcoin market, where rapid changes are frequent. In such highly volatile markets, shorter intervals are essential for accurately capturing these swift price movements, unlike in less volatile markets where longer intervals may suffice.
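To make the size of this dataset and the 80/20 split concrete, the following is a minimal sketch rather than the authors' code. It assumes the split is chronological (the text does not say whether rows were shuffled) and that no 15-minute candle is missing, which gives 365 × 96 = 35,040 rows. The helper name `chronological_split` and the placeholder `close` column are hypothetical.

```python
import pandas as pd


def chronological_split(df: pd.DataFrame, train_frac: float = 0.8):
    """Split a time-ordered frame into a leading train part and a trailing test part."""
    cut = int(len(df) * train_frac)
    return df.iloc[:cut], df.iloc[cut:]


# One year of 15-minute timestamps starting 2021-02-01: 365 days * 96 candles per day.
index = pd.date_range("2021-02-01", periods=365 * 96, freq="15min")
candles = pd.DataFrame({"close": 0.0}, index=index)  # placeholder values only

train, test = chronological_split(candles)
print(len(train), len(test))
```

Keeping the test set strictly after the training set in time is a common choice for price series, because it avoids evaluating on candles that precede the training data.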

Figure 1 illustrates the Bitcoin close price over time in USD. The x-axis represents dates, while the y-axis represents the price in USD. The plot provides a visual representation of the fluctuation in Bitcoin’s closing price over the observed period, enabling insights into the cryptocurrency’s price trend and volatility.

We use the StandardScaler from the sklearn.preprocessing module to scale the data. Let $x_{ij}$ denote the elements of the matrix $X$, where $i$ is the row index (sample) and $j$ is the column index (feature). The transformation applied by the StandardScaler to each feature $j$ is defined as follows:

1. Compute the mean ($\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_{ij}$) and standard deviation ($\sigma_j = \sqrt{\frac{1}{m}\sum_{i=1}^{m}(x_{ij} - \mu_j)^2}$) of feature $j$, where $m$ is the number of samples (rows) and $x_{ij}$ is the element at the $i$-th row and $j$-th column of $X$.

2. Apply the transformation to each element of feature $j$:

$$x'_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}$$

where $x'_{ij}$ is the scaled value of $x_{ij}$.
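The two steps above can be checked numerically. The sketch below uses a small made-up matrix (not data from the paper), applies the formula by hand with the population standard deviation (the $1/m$ form), and compares the result with `StandardScaler`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy matrix: 4 samples (rows), 2 features (columns). Values are made up.
X = np.array([[1.0, 200.0], [2.0, 240.0], [3.0, 260.0], [4.0, 300.0]])

# Step 1: per-feature mean and population standard deviation (ddof=0, i.e. 1/m).
mu = X.mean(axis=0)
sigma = X.std(axis=0)

# Step 2: apply the transformation element-wise.
X_manual = (X - mu) / sigma

# StandardScaler performs the same computation.
X_sklearn = StandardScaler().fit_transform(X)
print(np.allclose(X_manual, X_sklearn))
```

The text does not state which portion of the data the scaler was fitted on. A common practice, offered here only as a suggestion, is to fit it on the training portion and reuse the fitted scaler on the test portion.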

III-B Feature Engineering

In this section, we elaborate on the various features incorporated in this case study, using both historical market data and technical indicators.

III-B1 Historical Data

In historical data analysis, we use various metrics to understand the behavior of Bitcoin prices within specific time periods. These metrics include:

- Open price ($\mathcal{O}_p$): The initial price of Bitcoin at the start of a specific time period.

- Highest price ($\mathcal{H}_p$): The maximum price of Bitcoin recorded during a time period.

- Lowest price ($\mathcal{L}_p$): The minimum price of Bitcoin recorded during a time period.

- Close price ($\mathcal{C}_p$): The final price of Bitcoin at the end of a time period.

- Trading volume ($\mathcal{V}$): The total amount of Bitcoin traded within a time period.

- Quote Asset Volume (QAV): The total trading value of Bitcoin within a time period.

- Number of Trades (NOT): The total number of trades executed during a time period.

- Total Buy Base Volume (TBBV): The total quantity of Bitcoin purchased during a time period.

- Total Buy Quote Volume (TBQV): The total value of Bitcoin purchased during a time period.
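The nine fields above line up with the candlestick (kline) rows returned by the Binance REST API, where each row is a 12-element array. The sketch below parses such rows into a DataFrame without any network call. The column names, the helper `klines_to_frame`, the mapping of TBBV and TBQV to Binance's "taker buy" volumes, and the sample row are all assumptions made for illustration.

```python
import pandas as pd

# Order of fields in one Binance kline row (assumed from the public API layout).
KLINE_COLUMNS = [
    "open_time", "open", "high", "low", "close", "volume",
    "close_time", "quote_asset_volume", "number_of_trades",
    "taker_buy_base_volume", "taker_buy_quote_volume", "ignore",
]


def klines_to_frame(rows):
    """Convert raw kline rows into a float DataFrame indexed by candle open time."""
    df = pd.DataFrame(rows, columns=KLINE_COLUMNS)
    df["open_time"] = pd.to_datetime(df["open_time"], unit="ms")
    df = df.set_index("open_time").drop(columns=["close_time", "ignore"])
    return df.astype(float)  # prices and volumes arrive as strings


# A single made-up row in the expected layout (not real market data).
sample = [[1612137600000, "33092.97", "33150.00", "33000.00", "33100.50",
           "120.5", 1612138499999, "3988000.10", 2500, "60.2", "1992000.05", "0"]]
frame = klines_to_frame(sample)
print(frame.columns.tolist())
```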

III-B2 Technical Indicators

Technical analysis indicators represent a trading discipline used to assess investments and identify trading opportunities by analyzing statistical trends derived from trading activity, including price movements and volume [30]. In this study, we explore indicators to feed our machine learning model, such as EMA, MACD, Relative Strength Index (RSI), momentum, price rate of change, and stochastic oscillator.

We employ EMA with different periods, where $\text{EMA}_{10}$, $\text{EMA}_{30}$, and $\text{EMA}_{200}$ represent the exponentially weighted average price of Bitcoin over the last 10, 30, and 200 periods, respectively. To measure the magnitude of recent price changes and evaluate overbought or oversold conditions, we use RSI. Specifically, $\text{RSI}_{10}$, $\text{RSI}_{14}$, $\text{RSI}_{30}$, and $\text{RSI}_{200}$ assess price changes over 10, 14, 30, and 200 periods, respectively. In addition, we apply Momentum (MOM) indicators to gauge the rate of change in Bitcoin prices, with $\text{MOM}_{10}$ and $\text{MOM}_{30}$ reflecting changes over the last 10 and 30 periods, respectively.

Furthermore, we incorporate MACD, a trend-following momentum indicator that illustrates the relationship between two moving averages of Bitcoin prices. Additionally, we use %K10, %K30, and %K200 as components of the stochastic oscillator, which compare the current price of Bitcoin to its price range over the past 10, 30, and 200 periods, respectively. Finally, we include the Percentage Rate of Change with 9 periods ($\text{PROC}_9$), measuring the percentage change in Bitcoin prices over the past 9 periods.
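The indicators described above can be sketched with pandas as follows. This is an illustrative implementation, not the authors' code: the paper does not give its exact formulas, so the MACD spans (12 and 26), the simple-rolling-mean RSI variant, and all function and column names are assumptions.

```python
import pandas as pd


def ema(close: pd.Series, span: int) -> pd.Series:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    return close.ewm(span=span, adjust=False).mean()


def macd(close: pd.Series, fast: int = 12, slow: int = 26) -> pd.Series:
    """MACD line: fast EMA minus slow EMA (12 and 26 are assumed spans)."""
    return ema(close, fast) - ema(close, slow)


def rsi(close: pd.Series, period: int) -> pd.Series:
    """RSI from rolling means of gains and losses (one common variant)."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(period).mean()
    avg_loss = delta.clip(upper=0).abs().rolling(period).mean()
    return 100 - 100 / (1 + avg_gain / avg_loss)


def momentum(close: pd.Series, period: int) -> pd.Series:
    """Price difference against the value `period` steps earlier."""
    return close.diff(period)


def proc(close: pd.Series, period: int) -> pd.Series:
    """Percentage rate of change over `period` steps."""
    return (close / close.shift(period) - 1) * 100


def stochastic_k(close, high, low, period: int) -> pd.Series:
    """%K: position of the close within the high-low range of the window."""
    lowest = low.rolling(period).min()
    highest = high.rolling(period).max()
    return 100 * (close - lowest) / (highest - lowest)


def add_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Append the indicator columns named in the text to an OHLC frame."""
    out = df.copy()
    close = out["close"]
    for n in (10, 30, 200):
        out[f"ema_{n}"] = ema(close, n)
        out[f"k_{n}"] = stochastic_k(close, out["high"], out["low"], n)
    for n in (10, 14, 30, 200):
        out[f"rsi_{n}"] = rsi(close, n)
    for n in (10, 30):
        out[f"mom_{n}"] = momentum(close, n)
    out["macd"] = macd(close)
    out["proc_9"] = proc(close, 9)
    return out
```

Rolling indicators are undefined for the first rows of each window (for example, the first 200 rows for the 200-period variants), so those rows would typically be dropped before training.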

IV Methodology

This section details the proposed methodology. Algorithm 1 outlines our machine learning approach for cryptocurrency price forecasting using the XGBoost regressor model combined with various technical indicators such as EMA, MACD, RSI, and more. The process consists of data collection and preprocessing, feature engineering, model training with hyperparameter tuning, and model evaluation. Details of this methodology are discussed in the following subsections.

Let $(x^{(i)}, y^{(i)})$ denote a single sample/observation, and the set of samples is represented by:

where $x^{(i)} \in \mathbb{R}^n$ and $y^{(i)} = \mathcal{C}_p^{(i)}$.

Considering both technical indicators and historical data for price prediction necessitates the integration of various datasets. To achieve this, we combine technical indicators and historical data as inputs to our model. The feature vector at a given time $t$ can be expressed as follows:

To generalize our model, we stack all feature vectors into a matrix $\mathbf{X}$, which can be expressed as follows:

The output matrix can then be expressed as follows:

In this case study, the problem is to minimize the cost function for the XGBoost regressor, which is a regularized finite-sum minimization problem defined as:

Where:

- $\Theta$ represents the set of parameters to be learned during training.

- $L(y_i, \hat{y}_i)$ is the loss function that measures the difference between the true target value $y_i$ and the predicted target value $\hat{y}_i$ for the $i$-th instance. In the context of this case study, we employ the mean squared error (MSE) loss function, which is expressed as follows:

$$L(y_i, \hat{y}_i) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{4}$$

Here, $y_i$ is the true target value for sample $i$, and $\hat{y}_i$ is the predicted target value for sample $i$.

- $\mathcal{R}(f_k)$ represents the regularization term for each tree to control its complexity. It typically consists of both $L_1$ and $L_2$ regularization. Assuming $T$ is the number of leaves in tree $f_k$ and $w_{j,k}$ is the weight for leaf $j$ in tree $f_k$, the regularization term for tree $f_k$ is:

$$\mathcal{R}(f_k) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_{j,k}^2 + \alpha \sum_{j=1}^{T} |w_{j,k}| \tag{5}$$

The regularization terms ($\mathcal{R}(f_k)$) help control the complexity of individual trees within the ensemble, preventing overfitting.

During training, the XGBoost regressor aims to find the set of parameters ($\Theta$) that minimizes the overall cost function. The optimization is typically carried out using gradient boosting, which iteratively adds weak learners to the ensemble to reduce the residual errors [31].
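As a worked illustration of Eqs. (4) and (5), the sketch below evaluates the per-tree penalty and a regularized cost for given leaf weights. It assumes the usual additive form of the cost, namely the loss of Eq. (4) plus the sum of $\mathcal{R}(f_k)$ over all trees. The function names and the numbers are hypothetical, and this is not how XGBoost computes the objective internally.

```python
import numpy as np


def tree_penalty(leaf_weights, gamma: float, lam: float, alpha: float) -> float:
    """Eq. (5): gamma * T + 0.5 * lambda * sum(w^2) + alpha * sum(|w|)."""
    w = np.asarray(leaf_weights, dtype=float)
    return float(gamma * w.size + 0.5 * lam * np.sum(w ** 2) + alpha * np.sum(np.abs(w)))


def regularized_cost(y_true, y_pred, trees, gamma: float, lam: float, alpha: float) -> float:
    """Squared-error loss as written in Eq. (4) plus the penalty of every tree."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    loss = float(np.sum((y_true - y_pred) ** 2))
    return loss + sum(tree_penalty(w, gamma, lam, alpha) for w in trees)


# One tree with two leaves (T = 2) and weights 1 and -2:
# 0.5 * 2 + 0.5 * 2.0 * (1 + 4) + 0.1 * (1 + 2) = 1 + 5 + 0.3 = 6.3
penalty = tree_penalty([1.0, -2.0], gamma=0.5, lam=2.0, alpha=0.1)
cost = regularized_cost([1.0, 2.0], [1.0, 3.0], [[1.0, -2.0]], 0.5, 2.0, 0.1)
print(penalty, cost)
```

Larger values of $\gamma$ penalize trees with many leaves, while $\lambda$ and $\alpha$ shrink the leaf weights, which is how these terms limit overfitting.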

Table II presents the parameter grid used in GridSearchCV, a method for hyperparameter tuning in machine learning models. Hyperparameters are predefined settings that control the training process of algorithms. The table lists various hyperparameters commonly used in the XGBoost regressor model, a popular gradient boosting framework [31]. Each hyperparameter is accompanied by the values explored during the grid search. For instance, $N$ represents the number of estimators (trees) in the XGBoost model, with values of 300 and 400 being considered. Similarly, $\eta$ denotes the learning rate, with candidate values of 0.01, 0.1, and 0.2.

Other hyperparameters include $D_{\text{max}}$ for the maximum depth of trees, $W_{\text{min}}$ for minimum child weight, $S$ for subsampling ratio, $C$ for column subsampling ratio, $\gamma$ for the minimum loss reduction required to make further splits, $\alpha$ for the L1 regularization term on weights, and $\lambda$ for the L2 regularization term on weights.

This parameter grid serves as a roadmap for systematically exploring various combinations of hyperparameters to identify the optimal configuration for the XGBoost model, thereby enhancing its predictive performance. The best combination of hyperparameters for the XGBoost model was selected based on the smallest RMSE. The chosen parameters are as follows:

Finally, the RMSE achieved with this parameter combination is the smallest observed during the hyperparameter tuning process.

V Results and Analysis

In this section, we provide simulation-based evaluations of the proposed machine learning model. In particular, we compute the Mean Absolute Error (MAE), RMSE, and R-squared ($R^2$).

MAE offers a straightforward interpretation as the average absolute deviation between the predicted and actual values. It is easy to understand and is less sensitive to outliers than metrics like RMSE.

RMSE provides a measure of the average magnitude of prediction errors in the same units as the target variable. It penalizes larger errors more heavily than MAE, making it particularly useful when large errors are undesirable.

The $R^2$ score indicates how well the model fits the data relative to a simple baseline model that always predicts $\bar{y}$, the mean of the actual values of the target variable. Its maximum is 1, with higher values indicating a better fit; a model that performs worse than the mean baseline yields a negative value. The $R^2$ score is widely used for comparing different models and assessing overall model performance.
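The three metrics can be written out directly from their standard definitions, as in the following sketch. The function names and the toy values are made up for illustration and are unrelated to the results reported in the paper.

```python
import numpy as np


def mae(y_true, y_pred) -> float:
    """Mean of the absolute errors."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred) -> float:
    """Square root of the mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred) -> float:
    """1 minus the ratio of residual to total sum of squares around the mean."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)


y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]
# Errors are 1, 0, -1, -2: MAE = 4/4 = 1.0, RMSE = sqrt(6/4), R^2 = 1 - 6/20 = 0.7
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

In this toy case the single error of size 2 contributes four of the six units of squared error, which illustrates why RMSE (about 1.22) exceeds MAE (1.0) when errors are uneven.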

Table III presents key evaluation metrics for our regression model. The RMSE is 59.9504, indicating the square root of the average squared difference between predicted and actual values. The MAE is 46.2229, indicating the average absolute difference between predicted and actual values. The model’s $R^2$ score is 0.9999, reflecting an exceptionally strong fit to the data. Overall, the model demonstrates high accuracy and predictive capability, with low errors and a near-perfect $R^2$ score.

Another way to evaluate the performance of the XGBoost regressor model is to analyze the relationship between the predicted values and the residuals. Let $y_{\text{test}}$ be the true target values from the test dataset, $\hat{y}_{\text{pred}}$ be the predicted target values from the model, and $\varepsilon$ be the residuals calculated as $\varepsilon = y_{\text{test}} - \hat{y}_{\text{pred}}$.

Figure 2 shows a scatter plot of the residuals against the predicted values. The plot shows the relationship between the predicted values (scaled by 1000) and the residuals (scaled by 10). A horizontal dashed line at $y = 0$ indicates perfect prediction, where residuals are centered around zero. The plot illustrates the model’s ability to predict accurately across the range of predicted values.

Furthermore, Figure 3 presents a scatter plot comparing predicted values (in 1000s) with actual values (in 1000s). The diagonal dashed red line represents ideal prediction, where actual values align perfectly with predicted values. This plot offers insight into the model’s efficacy across the range of actual values, showcasing its predictive performance.
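The residual definition $\varepsilon = y_{\text{test}} - \hat{y}_{\text{pred}}$ used for Figure 2 can be sketched as follows. The helper names and the sample values are hypothetical. The summary statistics are one simple way to check numerically whether residuals are centered around zero, in addition to inspecting the scatter plot.

```python
import numpy as np


def residuals(y_test, y_pred) -> np.ndarray:
    """Element-wise residuals: true value minus predicted value."""
    return np.asarray(y_test, dtype=float) - np.asarray(y_pred, dtype=float)


def residual_summary(y_test, y_pred) -> dict:
    """Mean, spread, and largest absolute residual."""
    eps = residuals(y_test, y_pred)
    return {
        "mean": float(eps.mean()),
        "std": float(eps.std()),
        "max_abs": float(np.abs(eps).max()),
    }


# Made-up values: residuals are -1, 1, -1, 1, so they are centered on zero.
y_test = [100.0, 102.0, 101.0, 105.0]
y_pred = [101.0, 101.0, 102.0, 104.0]
summary = residual_summary(y_test, y_pred)
print(summary)
```

Plotting `y_pred` on the x-axis against `residuals(y_test, y_pred)` on the y-axis would give the kind of scatter described for Figure 2.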

V-A State-of-the-Art Comparison

Lastly, this subsection provides a comparison between the work proposed in this paper and existing studies in the literature.

Table IV gives a comprehensive comparison of various machine learning approaches in financial forecasting and trading. Shynkevich et al. [32] leverage machine learning algorithms on daily stock price time series, achieving optimal performance by analyzing different forecast horizons and input window lengths. Similarly, Liu et al. [2] employ SDAE deep learning models using historical data, public attention, and macroeconomic factors, which result in superior prediction accuracy. In addition, Jaquart et al. [33] implement ensemble machine learning models on cryptocurrency market data (streamed from CoinGecko [35]), producing statistically significant predictions and incorporating a long-short portfolio strategy. Furthermore, Hafid et al. [3] use a Random Forest classifier with historical data and a few technical indicators to achieve high accuracy in market trend prediction, effectively signaling buy and sell moments.

Saad et al. [15] combine economic theories with machine learning, analyzing user and network activity to achieve high accuracy in price prediction and provide insights into network dynamics. Moreover, Akyildirim et al. [34] apply SVM, LR, ANN, and RF algorithms to historical price data and technical indicators, demonstrating consistent predictive accuracy and trend predictability. In contrast, this paper introduces a novel approach using an XGBoost regressor with technical indicators and historical data, achieving low MAE, low RMSE, and an $R^2$ value close to 1, thereby contributing a new machine learning approach to the field.

VI Conclusion

Our research highlights the efficacy of the XGBoost regressor model in forecasting Bitcoin prices using a combination of technical indicators and historical market data. The model’s performance, as evidenced by the low Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) along with a near-perfect $R^2$ value, underscores its potential in providing accurate and reliable predictions in the highly volatile cryptocurrency market. By incorporating regularization techniques to mitigate overfitting and fine-tuning model parameters through an extensive grid search, we have achieved a robust predictive model. Furthermore, the use of various technical indicators such as the Exponential Moving Average (EMA), Moving Average Convergence Divergence (MACD), Relative Strength Index (RSI), and others, in conjunction with historical prices and volume data, has proven effective in enhancing the model’s predictive capabilities. This approach not only provides a comprehensive analysis of market trends but also facilitates better decision-making for traders and investors.

This work contributes to the field of financial forecasting, particularly in the area of cryptocurrency price prediction. The findings suggest that machine learning models, when properly calibrated and integrated with relevant technical indicators, can serve as powerful tools for navigating the complexities of financial markets. Future research may further explore the integration of additional data sources and advanced machine learning techniques to continue improving the accuracy and applicability of such models in dynamic trading environments.

References

- [1] Y.-L. Hsu, Y.-C. Tsai, and C.-T. Li, "FinGAT: Financial graph attention networks for recommending top-k profitable stocks," IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 1, pp. 469-481, 2021.
- [2] M. Liu, G. Li, J. Li, X. Zhu, and Y. Yao, "Forecasting the price of bitcoin using deep learning," Finance Research Letters, vol. 40, p. 101755, 2021.
- [3] A. Hafid, A. S. Hafid, and D. Makrakis, "Bitcoin price prediction using machine learning and technical indicators," in International Symposium on Distributed Computing and Artificial Intelligence. Springer, 2023, pp. 275-284.
- [4] S. Nakamoto, "Bitcoin: A peer-to-peer electronic cash system," Decentralized Business Review, 2008. [Online]. Available: http://dx.doi.org/10.2139/ssrn.3440802
- [5] A. Hafid, A. S. Hafid, and M. Samih, "Scaling blockchains: A comprehensive survey," IEEE Access, vol. 8, pp. 125244-125262, 2020.
- [6] H. Sebastião and P. Godinho, "Forecasting and trading cryptocurrencies with machine learning under changing market conditions," Financial Innovation, vol. 7, no. 1, pp. 1-30, 2021.
- [7] A. M. Khedr et al., "Cryptocurrency price prediction using traditional statistical and machine-learning techniques: A survey," Intelligent Systems in Accounting, Finance and Management, vol. 28, no. 1, pp. 3-34, 2021.
- [8] N. Nazareth and Y. V. R. Reddy, "Financial applications of machine learning: A literature review," Expert Systems with Applications, vol. 219, p. 119640, 2023.
- [9] S. Tanwar et al., "Machine learning adoption in blockchain-based smart applications: The challenges, and a way forward," IEEE Access, vol. 8, pp. 474-488, 2019.
- [10] J. B. Awotunde et al., "Machine learning algorithm for cryptocurrencies price prediction," in Artificial Intelligence for Cyber Security: Methods, Issues and Possible Horizons or Opportunities. Springer, 2021, pp. 421-447.
- [11] A. Greaves and B. Au, "Using the bitcoin transaction graph to predict the price of bitcoin," No Data, 2015.
- [12] H. Jang and J. Lee, "An empirical study on modeling and prediction of bitcoin prices with Bayesian neural networks based on blockchain information," IEEE Access, vol. 6, pp. 5427-5437, 2017.
- [13] S. McNally et al., "Predicting the price of bitcoin using machine learning," in 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP). IEEE, 2018, pp. 339-343.
- [14] S. Velankar et al., "Bitcoin price prediction using machine learning," in 2018 20th International Conference on Advanced Communication Technology (ICACT). IEEE, 2018, pp. 144-147.
- [15] M. Saad et al., "Toward characterizing blockchain-based cryptocurrencies for highly accurate predictions," IEEE Systems Journal, vol. 14, no. 1, pp. 321-332, 2019.
- [16] G. Wood et al., "Ethereum: A secure decentralised generalised transaction ledger," Ethereum project yellow paper, vol. 151, no. 2014, pp. 1-32, 2014.
- [17] E. Sin and L. Wang, "Bitcoin price prediction using ensembles of neural networks," in International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). IEEE, 2017, pp. 666-671.
- [18] E. Christoforou et al., "Neural networks for cryptocurrency evaluation and price fluctuation forecasting," in Mathematical Research for Blockchain Economy. Springer, 2020, pp. 133-149.
- [19] Z. Chen, C. Li, and W. Sun, "Bitcoin price prediction using machine learning: An approach to sample dimension engineering," Journal of Computational and Applied Mathematics, vol. 365, p. 112395, 2020. [Online]. Available: https://doi.org/10.1016/j.cam.2019.112395
- [20] E. Akyildirim et al., "Forecasting mid-price movement of bitcoin futures using machine learning," Annals of Operations Research, vol. 330, no. 1, pp. 553-584, 2023.
- [21] K. Martin et al., "Combining blockchain and machine learning to forecast cryptocurrency prices," in International Conference on Blockchain Computing and Applications (BCCA). IEEE, 2020, pp. 52-58.
- [22] S. Liu et al., "Financial time-series forecasting: Towards synergizing performance and interpretability within a hybrid machine learning approach," arXiv preprint arXiv:2401.00534, 2023.
- [23] K. He et al., "Financial time series forecasting with the deep learning ensemble model," Mathematics, vol. 11, no. 4, p. 1054, 2023.
- [24] A. Alfatemi, M. Rahouti, R. Amin, S. ALJamal, K. Xiong, and Y. Xin, "Advancing DDoS attack detection: A synergistic approach using deep residual neural networks and synthetic oversampling," arXiv preprint arXiv:2401.03116, 2024.
- [25] P. K. Nagula and C. Alexakis, "A new hybrid machine learning model for predicting the bitcoin (BTC-USD) price," Journal of Behavioral and Experimental Finance, vol. 36, p. 100741, 2022.
- [26] A. Petrovic et al., "Cryptocurrency price prediction by using hybrid machine learning and beetle antennae search approach," in Telecommunications Forum (TELFOR). IEEE, 2021, pp. 1-4.
- [27] R. Gupta and J. E. Nalavade, "Metaheuristic assisted hybrid classifier for bitcoin price prediction," Cybernetics and Systems, vol. 54, no. 7, pp. 1037-1061, 2023.
- [28] C. Luo et al., "Bitcoin price forecasting: an integrated approach using hybrid LSTM-ELM models," Mathematical Problems in Engineering, vol. 2022, 2022.
- [29] "Data from Binance API," https://www.binance.com/.
- [30] S. B. Achelis, "Technical analysis from A to Z," 2001.
- [31] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785-794.
- [32] Y. Shynkevich, T. M. McGinnity, S. A. Coleman, A. Belatreche, and Y. Li, "Forecasting price movements using technical indicators: Investigating the impact of varying input window length," Neurocomputing, vol. 264, pp. 71-88, 2017.
- [33] P. Jaquart, S. Köpke, and C. Weinhardt, "Machine learning for cryptocurrency market prediction and trading," The Journal of Finance and Data Science, vol. 8, pp. 331-352, 2022.
- [34] E. Akyildirim, A. Goncu, and A. Sensoy, "Prediction of cryptocurrency returns using machine learning," Annals of Operations Research, vol.
