In recent years, the digital transformation of government services has emerged as a critical global priority. Nevertheless, e-government remains an active and evolving research field, as many countries have implemented only partial solutions and continue to face unresolved technical and organizational challenges. As stated in [1], “the development of a shared e-government knowledge base is one of the key challenges of many e-government strategies”. This challenge arises from the heterogeneity of government entities, which hinders seamless interoperability and secure data exchange. To overcome such challenges, Semantic Web technologies such as RDF, OWL, and SPARQL have been increasingly adopted to construct unified, standards-based knowledge frameworks. These technologies support semantic interoperability across distributed systems and offer promising tools for integrating government services. However, as noted in [2], “much research in the Semantic Web and Linked Data domain has focused on enabling the sharing of open datasets”, often overlooking the security and access control requirements that are critical in sensitive domains such as public administration.
This research focuses on a critical aspect of secure e-government: access control. Although robust security in public administration is imperative, the integration of Semantic Web methods into these systems frequently exposes vulnerabilities, particularly within access control mechanisms. In response, we propose a solution that reinforces the conventional Role-Based Access Control (RBAC) model. Our approach uses ontology-driven methods to enforce access policies dynamically, ensuring that only authorized users gain access to sensitive information. The central hypothesis of this study is that embedding Semantic Web technologies into access control frameworks not only improves data interoperability but also significantly enhances security by preventing unauthorized access and ensuring proper user authentication.
To validate this hypothesis, we designed and implemented a prototype using Apache Jena Fuseki together with RDF, OWL, and SPARQL. The prototype was evaluated in an e-government context, demonstrating that dynamic semantic reasoning and flexible policy updates can meet the complex security requirements of distributed public services. The results indicate that our approach supports scalable, interoperable, and secure e-government systems, paving the way for broader adoption of Semantic Web technologies in public administration.
This paper contributes to bridging the gap between theoretical research and practical application in information security, the Semantic Web, and public administration. By integrating semantic reasoning with enhanced access control, our work presents a practical framework that addresses the key challenges of data interoperability and security within e-government systems.
A review of the literature reveals extensive research on both Semantic Web applications and e-government systems. Previous studies have tackled issues such as data heterogeneity, interoperability challenges, and security vulnerabilities. Multiple methodologies have been proposed for integrating semantic technologies into public administration, with particular attention to the dynamic enforcement of access control policies and the use of ontologies for modeling complex governmental data.
Building on these findings, our work presents a comprehensive solution that unifies semantic data sharing with enhanced access control, thereby addressing both integration and security requirements in e-government environments.
Semantic Web-Based E-Government
Semantic Web technologies have become a cornerstone for achieving interoperability and data integration in e-government systems. Study [3] mapped a range of case studies; for example, [4] developed a domain ontology for Nepal’s citizenship certificates, improving issuance accuracy and efficiency, and [5] introduced semantically reusable Web Components that measurably enhance response time and interoperability, while also noting that practical deployment details remain underexplored. Concrete prototypes further illustrate these insights: [6] harmonized civil, health, and education schemas into a unified OWL ontology, enhancing consistency and query precision, and [7] implemented an OWL-based integration platform in Kuwait, enabling real-time semantic queries across ministries. At the national level, [8] showcases Finland’s Semantic Web infrastructure: a cross-domain ontology “layer cake” and a series of Linked Open Data portals built on SPARQL endpoints. For over two decades, this infrastructure has supported hundreds of applications, proving that scalable, government-wide semantic integration is both feasible and impactful. Study [9] surveyed RDF, OWL, and SPARQL applications across public-sector services, categorizing technical and socio-economic challenges, particularly around security and real-world deployment, and concluded that the Semantic Web lacks the maturity of a production-grade artifact, calling for increased attention from both academia and industry. Together, these studies trace the evolution from targeted domain ontologies to large-scale national frameworks, paving the way for our ontology-based RBAC solution that combines semantic data sharing with dynamic access control.
Semantic Web-Based Access Control
Since the inception of Semantic Web technologies, many studies have investigated their application in access control to address security vulnerabilities in distributed systems. Researchers have explored various models, including DAC (Discretionary Access Control), MAC (Mandatory Access Control), RBAC, and ABAC (Attribute-Based Access Control). These studies have yielded the following findings:
For MAC and DAC, studies such as [10] have focused on defining vocabularies that support multiple access control models using DAML+OIL (DARPA Agent Markup Language + Ontology Inference Layer) ontologies. Similarly, [11] proposed an attribute-based model to overcome heterogeneity in distributed environments, supporting MAC, DAC, and RBAC.
In the domain of RBAC, recent works have advanced semantic role modeling and multi-domain integration. [12] proposes an intelligent RBAC framework that defines “semantic business roles” via OWL ontologies and enables policy evaluation across organizational boundaries. [13] introduces a semantic security platform that implements an enhanced RBAC model (merging RBAC and ABAC) using ontology modeling techniques. [14] presents a feature-oriented survey of ontology- and rule-based access control systems with a focus on conflict resolution and dynamic decision making. [15] demonstrates an ontology-based data access case study in which semantic queries enforce role assignments and permissions within a distributed environment, validating practical applicability.
Regarding ABAC, studies have focused on attribute-driven policy enforcement and fine-grained control. [16] introduces a semantic ABAC model based on ontology-defined attributes and context rules for adaptive access decisions. [17] extends semantic ABAC to e-Health, designing an ontology that maps user, resource, and contextual attributes to enable secure, fine-grained medical data access, and [18] presents a general ontology for access control that performs effectively in large-scale, heterogeneous environments.
Collectively, these studies demonstrate that semantic web technologies can effectively support various access control models. However, challenges remain when applying these technologies in environments with sensitive data, such as e-government systems.
MATERIALS AND METHODS
In this section, we propose a solution for information sharing in an e-government environment, together with an access control mechanism for that environment. Our approach builds on foundational ontology-design methodologies from recent research studies [19, 20, 21], adheres to widely accepted Semantic Web standards, including RDF, RDFS (RDF Schema), OWL, and SPARQL, and employs the Protégé platform and GraphDB’s visual graph feature for ontology development and visualization. Semantic data is stored, queried, and managed in an Apache Jena Fuseki triple store, while the Semiodesk Trinity framework provides seamless .NET integration with Fuseki. The web application layer is implemented using ASP.NET MVC 5 and ASP.NET Core within Visual Studio 2022. This foundation enables the implementation of a scalable, interoperable, and secure e-government system that integrates semantic reasoning and dynamic access control policies.
Proposed Solution for E-Government Information Sharing
In this study, we assume the existence of four government entities, each developing its own application while enabling information and knowledge sharing among themselves. These entities are:
* Ministry of Health
* Ministry of Labor
* Ministry of Higher Education
* Civil Registry
To facilitate interoperability, a simplified yet expandable ontology was designed for each entity.
Ministry of Health Ontology: This ontology consists of the following classes: mc-Patient, mc-Hospital, mc-Injury, and mc-InjuryDetails. See Figure 1.
Figure 1 – Ministry of Health Proposed Ontology
Ministry of Labor Ontology: This ontology includes the classes: mc-Beneficiary, mc-EmploymentRequest, and mc-FamilySupport.
Ministry of Higher Education Ontology: This ontology is composed of the classes: mc-StudentProfile, mc-Course, and mc-Exam.
Civil Registry Ontology: This ontology contains a single core class, mc-PersonProfile, which stores the personal information of citizens, along with an auxiliary class, mc-Citizen, introduced as a container that links the other ontologies. See Figure 2.
Figure 2 – Civil Registry Proposed Ontology
The ontology model ensures that each government entity maintains its own structured data while remaining interoperable through shared concepts.
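To make the ontology layer concrete, the sketch below shows how the Ministry of Health classes could be declared and loaded into the Fuseki triple store as a SPARQL 1.1 update. The namespace prefix (egov:) and the two object properties are illustrative assumptions, since the paper does not publish its ontology IRIs.

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

INSERT DATA {
  # Ministry of Health classes (Figure 1)
  egov:mc-Patient       a owl:Class .
  egov:mc-Hospital      a owl:Class .
  egov:mc-Injury        a owl:Class .
  egov:mc-InjuryDetails a owl:Class .

  # Illustrative object properties linking the classes (assumed names)
  egov:treatedAt        a owl:ObjectProperty ;
                        rdfs:domain egov:mc-Patient ;
                        rdfs:range  egov:mc-Hospital .
  egov:hasInjuryDetails a owl:ObjectProperty ;
                        rdfs:domain egov:mc-Patient ;
                        rdfs:range  egov:mc-InjuryDetails .
}

The same pattern applies to the Ministry of Labor, Ministry of Higher Education, and Civil Registry ontologies, each loaded into its own dataset or named graph.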
Information Sharing Among E-Government Ontologies
The proposed solution establishes semantic relationships between the different government entities by defining mc-Patient (Ministry of Health), mc-Beneficiary (Ministry of Labor), and mc-StudentProfile (Ministry of Higher Education) as subclasses of the mc-PersonProfile class (Civil Registry). See Figure 3. By applying inheritance principles, any instance created in these subclasses automatically inherits its personal data from the corresponding mc-PersonProfile instance in the Civil Registry. This ensures that all citizens, whether students, beneficiaries, or patients, are first recognized as individuals within the national Civil Registry system before being associated with specific government sectors. This ontology-driven approach enhances data consistency, reduces redundancy, and enables seamless information retrieval across multiple government institutions, forming the foundation for a unified and interoperable e-government system.
Figure 3 – E-Gov Proposed Ontologies
To demonstrate information sharing among e-government ontologies, several SPARQL query examples are provided in the supplementary materials.
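As an illustrative sketch of this sharing mechanism (again using the hypothetical egov: namespace and assumed property names), the subclass links can be asserted with a SPARQL update, and a separate query can then return, for every registered patient, the personal data held by the Civil Registry:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

# Update request: subclass links that realize the sharing model of Figure 3
INSERT DATA {
  egov:mc-Patient        rdfs:subClassOf egov:mc-PersonProfile .
  egov:mc-Beneficiary    rdfs:subClassOf egov:mc-PersonProfile .
  egov:mc-StudentProfile rdfs:subClassOf egov:mc-PersonProfile .
}

# Separate query request: cross-entity retrieval of Civil Registry data
# for every individual registered as a patient (assumed property names)
SELECT ?patient ?fullName ?nationalId
WHERE {
  ?patient a egov:mc-Patient ;
           egov:fullName   ?fullName ;
           egov:nationalId ?nationalId .
}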
Proposed Solution for Access Control
The RBAC (Role-Based Access Control) model was selected as the foundation of the proposed solution due to the structured, role-based nature of e-government institutions. Since government environments typically have well-defined actor roles, RBAC provides a policy-neutral, manageable, and scalable approach to access control. As stated in study [22]: “Role-Based Access Control models appear to be the most attractive solution for providing security features in multidomain e-government infrastructure. RBAC features such as policy neutrality, principle of least privilege, and ease of management make them especially suitable candidates for ensuring safety in e-government environment”. RBAC is commonly classified into four levels, from the simplest to the most advanced: Flat RBAC, Hierarchical RBAC, Constrained RBAC, and Symmetric RBAC, each building upon the previous one. Since the goal of this research is to develop a simple yet expandable solution, the proposed approach implements Flat RBAC while ensuring that future extensions to Hierarchical and Constrained RBAC remain feasible. Following ontology design principles, we begin by modeling a core RBAC ontology, depicted in Figure 4, that conforms to the Flat RBAC standard.
Figure 4 – Conceptual RBAC Ontology Based on the Flat RBAC Model
This conceptual design captures the essential components of role-based access control (User, Role, Permission) and serves as a foundation for a more practical implementation. By decoupling permissions into explicit (Action-Resource) pairs, this ontology enforces clear semantics for each access right. However, while theoretically sound, the core model is not optimally structured for direct use in real-world applications due to its abstract handling of permission granularity and lack of support for multi-application contexts. To address these limitations and enable the application of RBAC in real deployment environments, we extend and restructure the initial model into a more implementation-oriented ontology, as shown in Figure 5.
Figure 5 – Application-Oriented RBAC Ontology for E-Government Access Control
This model replaces the triplet of Permission, Action, and Resource with a single Method class, which represents the functions or procedures within the system that users interact with. It also adds two further classes: Credential (for user authentication) and Application (for managing access across multiple systems). This ontology enables administrators to assign methods (i.e., grouped action-resource operations) to roles per application, and to authenticate users via credentials before role activation. The proposed model meets the fundamental requirements of the Flat RBAC standard:
Users acquire permissions (Methods in our case) through roles.
Both user-role assignments and permission-role assignments (Method-Role assignments in our case) follow a many-to-many relationship.
The system supports user-role review.
Users can exercise permissions associated with multiple roles simultaneously.
These compliance criteria were verified through SPARQL queries, which are provided in the supplementary materials for reference.
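For illustration, a compliance check of this kind could be expressed against the ontology of Figure 5 as follows; the instance data, namespace, and property names (hasRole, hasMethod, signature, belongsToApplication) are assumptions for the sketch and are not taken from the paper.

PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

# Update request: a user acquires a Method through a Role within an Application
INSERT DATA {
  egov:user-001 a egov:User ;
                egov:hasRole egov:role-HealthClerk .
  egov:role-HealthClerk a egov:Role ;
                egov:hasMethod egov:method-AppHealth_Patient_Create .
  egov:method-AppHealth_Patient_Create a egov:Method ;
                egov:belongsToApplication egov:app-MinistryOfHealth .
}

# Flat RBAC check: does the user hold this method through at least one role?
ASK {
  egov:user-001 egov:hasRole   ?role .
  ?role         egov:hasMethod egov:method-AppHealth_Patient_Create .
}

# User-role review (Flat RBAC requirement): list every role per user
SELECT ?user ?role WHERE { ?user egov:hasRole ?role }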
RESULTS
The research resulted in the development of the following applications:
Access Control Application:
This application enables administrators to define users, roles, and permissions, effectively implementing the Role-Based Access Control (RBAC) model.
Additionally, it provides an API service that allows e-government applications to request user access verification and make allow/deny decisions accordingly.
E-Government Applications:
These applications utilize the access control system for managing secure access while supporting interoperable data exchange among government entities.
Implementation of the Access Control Management Web Application
A web-based application was developed to serve as the administrative interface for the Access Control System. This application was built using ASP.NET MVC 5, leveraging the Semiodesk Trinity platform for data-layer integration. It provides system administrators with full control over Applications, Users, Roles, and Permissions (Methods). Additionally, the system manages its own access, applying the same authentication and authorization mechanisms to itself.
Authentication and Authorization Process
User Login (Authentication):
The system verifies user credentials by searching for a matching username and password in the stored user data.
Upon successful authentication, the system retrieves the roles assigned to the user and the permissions linked to those roles.
Authorization Mechanism:
Once authenticated, the system determines whether the user is authorized to access a specific method.
For example, when a user requests the home page (Index) within the HomeController of the Access Control Application, the system evaluates whether the method’s signature (AppRbac_Home_Index) exists within the user’s assigned permissions.
If a match is found, the user is granted access, and the requested page is displayed.
This authorization mechanism is enforced throughout the application. Each time a user navigates between interfaces or performs an action, the system validates their authorization to invoke the corresponding method, ensuring role-based access control. Figure 6 illustrates one of the user interfaces of the application.
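A minimal sketch of this check, assuming the hypothetical egov: namespace and property names used above (the method signature AppRbac_Home_Index is the one cited in the text, while the username is an example value), is:

PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

# Is the authenticated user allowed to invoke HomeController.Index?
ASK {
  ?user   egov:hasCredential [ egov:username "j.doe" ] ;   # example username
          egov:hasRole       ?role .
  ?role   egov:hasMethod     ?method .
  ?method egov:signature     "AppRbac_Home_Index" .
}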
Figure 6 – Roles Management in Access Control Application
Integration with E-Government Applications via AppMgr.Api
A dedicated API service (AppMgr.Api) was developed to facilitate communication between the Access Control System and e-government applications. This service is invoked by e-government applications whenever a user attempts to log in.
When an e-government application sends a login request, it includes the username and password of the user.
The authentication process follows the same mechanism described earlier:
The system verifies the credentials.
If authentication is successful, the user’s roles and permissions are retrieved.
The API response includes:
The user’s assigned roles and permissions.
The URL to which the user should be redirected upon successful login.
Unlike the Access Control Management Web Application, authorization is not handled by the API itself. Instead, each e-government application processes authorization internally, relying on the permissions received from the API.
This modular approach ensures flexibility, allowing each e-government system to enforce role-based access control (RBAC) policies based on its specific operational requirements.
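A sketch of the lookup behind such a login request, under the same assumptions about the namespace and property names (the redirect URL is modeled here as a property of the Application individual, which is an assumption), is:

PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

# Retrieve the roles, permitted method signatures, and redirect URL
# returned to the calling e-government application after authentication
SELECT ?role ?methodSignature ?redirectUrl
WHERE {
  ?user   egov:hasCredential ?cred .
  ?cred   egov:username "j.doe" .            # example credentials; the password check
  ?cred   egov:passwordHash ?storedHash .    # would be performed in application code
  ?user   egov:hasRole ?role .
  ?role   egov:hasMethod ?method .
  ?method egov:signature ?methodSignature ;
          egov:belongsToApplication ?app .
  ?app    egov:redirectUrl ?redirectUrl .
}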
E-Government Applications and Information Sharing Between Them
The e-government applications were developed using ASP.NET Core, in combination with Semiodesk Trinity and the Apache Jena Fuseki triple store. These applications were integrated with the Access Control Service, which manages both authentication and authorization processes.
Example: Patient Registration in the Ministry of Health Application
As shown in Figure 7, when a new patient is registered in the Ministry of Health application, the system first queries the Civil Registry application using the citizen’s national ID.
The registry returns the citizen’s personal information, and the Ministry of Health user then adds the patient’s medical details.
Figure 7 – (Add Patient) Interface in the Ministry of Health Application
Similarly, new beneficiary registrations in the Ministry of Labor application and student registrations in the Ministry of Higher Education application rely on retrieving personal details from the Civil Registry. This demonstrates the seamless interoperability and efficient data sharing enabled by the semantic integration model.
Example: Sharing Medical Records Between Applications
Figure 8 illustrates the family support interface in the Ministry of Labor application, where the amount of support is calculated from the injury percentage of each beneficiary.
The injury-percentage data originates from the Ministry of Health ontology, further validating the effectiveness of semantic information sharing across government applications; a representative cross-ontology query is sketched after Figure 8.
Figure 8 – (Family Support) Interface in the Ministry of Labor Application
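A representative cross-ontology query for this calculation, sketched with the hypothetical egov: namespace and assumed property names (the support formula in the BIND is purely illustrative and is not taken from the paper), is:

PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

# Join Ministry of Labor and Ministry of Health data for the same citizen:
# the individual can be typed both as a beneficiary and as a patient because
# both classes are subclasses of mc-PersonProfile
SELECT ?citizen ?injuryPercentage ?supportAmount
WHERE {
  ?citizen a egov:mc-Beneficiary , egov:mc-Patient ;
           egov:hasInjuryDetails ?details .
  ?details egov:injuryPercentage ?injuryPercentage .
  BIND( ?injuryPercentage * 10 AS ?supportAmount )   # illustrative support formula only
}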
Reasoning Activation in E-Government Applications
To enhance data inference capabilities, a reasoning engine was activated within the Fuseki triple store using the OWLMicroFBRuleReasoner.
Example of automated inference:
The reasoning engine allows the system to derive new knowledge that was not explicitly stored in the triple store.
Consider the following inverse relationships between the Exam and Course classes:
Course → has_exam → Exam
Exam → exam_has_course → Course
If the triple (course1 has_exam exam1) is added, the reasoning engine automatically infers the inverse relationship:
(exam1 exam_has_course course1)
This inference is dynamically added to the e-government dataset, ensuring data consistency and completeness.
The effectiveness of this semantic reasoning mechanism was successfully tested in the student exam details interface, along with several other logical inferences within the applications.
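A minimal sketch of this inference, assuming the hypothetical egov: namespace (the property names has_exam and exam_has_course are those used in the text), is shown below; with the reasoner attached to the dataset, the ASK query succeeds even though the inverse triple was never asserted.

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX egov: <http://example.org/egov#>   # hypothetical namespace

INSERT DATA {
  egov:has_exam        a owl:ObjectProperty .
  egov:exam_has_course a owl:ObjectProperty ;
                       owl:inverseOf egov:has_exam .
  egov:course1 egov:has_exam egov:exam1 .      # the only asserted instance triple
}

# Returns true when inference is enabled: the inverse triple is derived, not stored
ASK { egov:exam1 egov:exam_has_course egov:course1 }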
DISCUSSION
Our work advances both semantic information sharing and access control in ways that address the limitations noted in prior studies. Unlike study [7], which proposed ontologies without implementation, we developed a working prototype that demonstrates real-time data exchange across government domains. In contrast to study [1], which lacked a mechanism to identify the appropriate authority for a given service, our model integrates an ontology-driven RBAC system to securely handle such decisions. From an access control perspective, the study validates that Semantic Web technologies can effectively implement a Role-Based Access Control (RBAC) model through ontology-driven mechanisms. While most access control research remains theoretical or limited to less-sensitive domains such as Online Social Networks or cloud platforms [18], our solution is applied in an e-government context, managing sensitive data through a fully implemented, policy-aware system. While this work focused on the Flat RBAC model, its semantic foundation facilitates natural extensions to Hierarchical and Constrained RBAC.
Evaluation Criteria and System Assessment
To further assess the quality and applicability of the proposed system, we evaluated it against commonly accepted criteria in semantic e-government research, as outlined below:
This qualitative evaluation demonstrates that the proposed solution is not only conceptually sound but also practical, modular, and aligned with real-world public-sector requirements.
CONCLUSIONS AND RECOMMENDATIONS
This research introduced a semantic web-based framework for secure information sharing and access control in e-government environments. The study confirmed that by leveraging ontologies and reasoning engines, government systems can achieve improved interoperability, reduced redundancy, and scalable architecture—while also supporting dynamic, fine-grained access control mechanisms. The integration of ontology modeling with access control policies strengthens both security and flexibility in distributed digital services.
Based on these findings, the following recommendations are proposed:
Expand e-government ontologies by integrating additional domain-specific concepts and linking to existing public ontologies on the web to enhance service coverage.
Extend the access control ontology to support Hierarchical RBAC and Constrained RBAC, leveraging OWL constructs to model complex permission structures.
Deploy the developed applications on the public web, hosted by trusted national IT infrastructures, to enable citizen-facing services while maintaining data protection and system integrity.
INTRODUCTION
Recently, a growing number of warnings have been issued about the fate of life on planet Earth. Its ecosystem, with all its components, is being continuously polluted by both organic and inorganic contaminants as a result of successive industrial revolutions and the expansion of human activities. Growth in the textile and oil industries, in particular, has caused severe and lasting deterioration of the aquatic environment, the consequences of which are still felt today (1). In several Asian and African countries, population growth and the pursuit of greater economic returns have led governments to expand textile, pharmaceutical, pesticide, and other industries as central pillars of economic growth. These development gains, however, have come at a cost with respect to the production of organic dyes and petroleum-based pesticides: output has reached enormous levels, tens of thousands of tons of dyes, pesticides, and even raw pharmaceutical materials, with negative impacts on rivers, drinking water sources, and brackish water sources (1,2). Scientific studies have not concealed the fact that the gradual accumulation of byproducts of these production processes (heavy metals, fillers, equipment lubricants, etc.) enters water bodies as non-biodegradable waste, exposing them to hazards that may be unavoidable in the future.
The strong interconnectedness of the living and non-living components of water, and the ease with which organic contaminants move between them, increases the complexity of water pollution and its interconnection with other environmental media. These contaminants reduce light penetration in water, impair chlorophyll formation in aquatic plants, increase the rate of anaerobic fermentation, kill marine organisms, and decrease the levels of important ions (potassium, sodium, chloride, etc.), among other effects of organic water pollution. Organic contaminants fall into different families, classified according to their toxicity or chemical composition: colored azo contaminants (homocyclic and heterocyclic aromatic compounds), petroleum contaminants (monocyclic aromatic compounds), and dozens of other groups of hydrocarbon compounds (POPs, PAHs), etc. (3). Many of these substances are among the one hundred and twenty-nine priority pollutants listed in the U.S. Clean Water Act and silently deplete and kill aquatic resources, with dire consequences. This behavior reflects the characteristic properties of persistent organic contaminants: a capacity to accumulate, complex chemical composition, heterogeneous and unstable distribution between solid and liquid phases, high lipid solubility, and bioaccumulation in human and animal tissues (4-6).
According to the reports of Al-Tohamy (1), Bishnoi (2), and Abu-Nada (7), phenolic hydrocarbons and halogenated monoaromatic hydrocarbons, commonly used as pesticides, are highly enriched compared to polycyclic hydrocarbons in agricultural soils, wastewater, and industrial sludge. The researchers’ conclusions regarding the reason for this enrichment converge (1)(2)(7).
Their conclusions regarding the increased enrichment in soils and wastewater were as follows: phenolic and halogenated compounds can interact with the available organic matter through an adsorption mechanism (8).
Over the past few decades, the goal of preserving water resources in the first place, and other environmental media in the second, has been a constant preoccupation and major concern for researchers, as reflected in the publication of thousands of quality articles aimed at finding ways to treat water contaminated with various types of organic pollutants. Given the resistance of complex contaminants to environmental degradation, whether through chemical or biological reactions, and the limitations of older treatment methods, researchers have emphasized interdisciplinary collaboration between chemistry and environmental science to generate better solutions for water treatment. After continuous research and demanding experiments, this alliance has produced a new generation of ultra-small materials, the so-called nanomaterials, in various polymeric, metallic, and organic forms, prepared using modern and sustainable methodologies. Nanoscale researchers have focused in particular on metallic compounds (primarily noble metal nanoparticles), whose excellent optical, magnetic, structural, crystalline, and surface properties can neutralize a significant portion of the organic contaminants in water. Many methods based on composites, hybrids, and alloys of small noble metal particles have been proposed for contaminant removal, including precipitation, coagulation, adsorption, and others (9).
The photoreduction method relies on light sources of different kinds (infrared, ultraviolet, visible) and on photoactive materials such as Cd-MOF (10), zinc oxide (11), cadmium sulfide (13), and zero-valent iron nanoparticles (14). It is characterized by its economy, ease of application, and low environmental side effects. The photoreduction mechanism rests on two fundamental points: the first is the change in the bandgap of the nanoscale catalyst responsible for degradation, and the second is the surface plasmon resonance (SPR) property. Regarding the first point, two different semiconductors, p-type and n-type, must be available to generate a continuous cascade of electron-hole pairs. Regarding the second point, SPR arises from the collective oscillation of free electrons localized on the surface of nanoparticles (especially gold, silver, and copper, and to a lesser extent platinum) when light falls on them (15).
Owing to their thermal stability, chemical inertness toward oxidizing agents, bioactivity, unique surface properties, and the possibility of producing them at the nanoscale in various morphologies, noble metal particles (Pd, Rh, Au, Ag, Pt) have attracted the attention of many biologists, chemists, and bioengineers across many applications (16)(17)(18). For their part, researchers concerned with protecting the environment from imminent dangers are increasingly developing methods for using these metals in environmental applications such as advanced oxidation of organic compounds, reduction of toxic exhaust emissions from internal combustion engines, water splitting, and more (19).
Many researchers have utilized noble metal nanoparticles in catalytic reactions of organic compounds (dyes, petroleum derivatives, pesticides), after loading them onto the surface of metal oxides such as titanium oxide, zinc oxide, and copper ferrite, or by applying harsh reaction conditions, in order to increase catalytic activity and accelerate contaminant removal (5)(10)(11). Liu (16) studied the effect of crystallization on the catalytic performance of titanium oxide supported by gold particles and found that the improved crystalline properties in the presence of gold particles favorably accelerated the catalytic degradation of a number of contaminants. In another paper, Zheng et al. (20) demonstrated that combining zinc oxide with zero-valent silver “Ag(0)” improved electron-hole generation, which in turn improved the degradation performance of the nanostructure under visible light irradiation. The same paper (20) reported the synthesis of three metallic nanoparticles by ultrasonication in a weakly alkaline medium in the presence of sodium borohydride, each with its own nanoscale structure (morphology and crystallography).
This study aimed to apply the foundations of green chemistry, particularly by utilizing the probe-ultrasound method to prepare three different nanoscale catalysts (Ag NPs, Au NPs and Pt NPs) under safe and easy-to-use conditions. The distinct structural properties revealed by their characterization paved the way for their application in catalytic reactions driven by a simulated sunlight source (λ = 200-800 nm). Crucially, these differences in properties enabled a meaningful comparative study of the catalytic decomposition of the four contaminants p-NP, MB, TCB, and Rh B. The novelty of this research lies in the environmental sustainability of the prepared particles, as the nanoparticles can be reused multiple times with high efficiency. Ag NPs demonstrated the highest photoreduction catalytic performance in removing all contaminants from aqueous media at all applied concentrations; Pt NPs ranked second, followed by Au NPs, and the photoreduction behavior varied with contaminant type. The excellent reusability rates clearly show that the three groups of prepared particles are suitable for future photoreduction applications.
MATERIALS AND METHODS
All chemicals used in this work, as listed below, were supplied by Sigma-Aldrich (China) and used without further purification: HAuCl4·3H2O (≥99.99%, Au basis), H2PtCl6·6H2O (≥37.50%, Pt basis), AgNO3 (≥99.00%, trace metal basis), ethylene glycol (EG, (CH2)2(OH)2, anhydrous, 99.8%), hydrazine (N2H4·H2O, 80.00%), methylene blue (MB, C16H18ClN3S·xH2O, ≥95.00%), para-nitrophenol (p-NP, O2NC6H4OH, ≥99.00%), 2,4,6-trichlorobenzene (TCB, Cl3C6H2SO2Cl, ≥96.00%) and Rhodamine B (Rh B, C28H31ClN2O3, ≥95.00%). To prepare the photocatalysts considered in this paper (Pt NPs, Au NPs and Ag NPs), a suitable molar amount of each metal precursor (“0.13653 g HAuCl4·3H2O”, “0.13725 g H2PtCl6·6H2O”, “0.75295 g AgNO3”) was mixed with 50 mL of EG in three separate beakers. Each solution was heated at 75 °C for four hours with gentle magnetic stirring, and the initial color change was observed (the Au3+/EG solution turned from yellow to very dark gold, the Pt4+/EG solution from intense orange to orange-brown, and the Ag+/EG solution from transparent to pale gray). The beakers were then transferred to an ultrasonic probe system (titanium sono-horn, 12.5 mm in diameter, operating at 20 kHz with a maximum power output of 600 W). Each solution was sonicated according to the following profile: 300 s on, 120 s off, at 75 °C, for a total sonication time of 25 min at 150 W. During sonication, sodium hydroxide solution (2 M) was added until the pH of the medium reached 12, and then 5 mL of hydrazine solution (10% v/v) was added dropwise. The colors of the resulting precipitates were black (Pt NPs), dark brown (Au NPs) and dark gray (Ag NPs). Each precipitate was washed several times with an ultrapure water/ethanol mixture (1:2 v/v) to remove any unreacted material and finally dried at 90 °C for 12 h. Figure 1 shows a schematic of the probe-sonication preparation stages of the noble metal nanoparticles (Pt NPs, Au NPs and Ag NPs).
Figure 1. Schematic of the preparation stages of the photocatalyst nanoparticles (Pt NPs, Au NPs and Ag NPs)
The photoreduction of four hazardous organic pollutants – methylene blue (MB), para-nitrophenol (p-NP), Rhodamine B (Rh B), and 2,4,6-trichlorobenzene (TCB) – in the presence of NaBH4 under visible light irradiation was employed as a model reaction to evaluate the catalytic reduction performance of the synthesized noble metal nanoparticles (Pt NPs, Au NPs, and Ag NPs). A 0.26 mM NaBH4 solution was prepared and stored in the dark. In a typical photoreduction test, 10.00 mg of the nanocatalyst (Pt NPs, Au NPs or Ag NPs) was added separately to the aqueous solution of the relevant contaminant (10 mL, 10 mg.L-1 “ppm”) and ultrasonicated at room temperature for 60 s, after which 100 µL of the NaBH4 solution (0.26 mM) was mixed with the contaminant solution. After sonication, the solutions were exposed to visible light for three continuous hours. A 5 mL aliquot of the suspension, containing both the photocatalyst and the target contaminant, was then taken out and centrifuged at 6000 rpm. All irradiations were performed with a white LED lamp (radiant intensity of 3 mW/cm2 in the wavelength range 400-780 nm, with 10% of this in the ultraviolet range; power of 7-10 W at 0.0083 A; optical rise time of 7 ns; illumination intensity of 400 µW.cm-1; beam diameter ≥10 mm) as a solar-simulated light source. The photoreduction outcomes were read with a UV-Vis spectrophotometer using the Beer-Lambert law at a prominent wavelength for each contaminant solution (λ = 664 nm for MB, λ = 405 nm for p-NP, λ = 555 nm for Rh B and λ = 265 nm for TCB), corresponding to the maximum absorbance of the contaminant stock solution. The removal of each contaminant can also be quantified by the photocatalysis efficiency given by Equation 1.
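Equation 1 is reproduced here in what is presumably its standard removal-efficiency form, based on the definitions of Co and Ce that follow:

\[ \text{Efficiency (\%)} = \frac{C_o - C_e}{C_o} \times 100 \qquad (1) \]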
where Co is the initial concentration of the contaminant solution (mg.L-1) and Ce is its equilibrium concentration (mg.L-1). The photoreduction efficiency depends strongly on the initial contaminant concentration. To assess this, different concentrations of each contaminant (5, 10, 15 and 20 mg.L-1) were tested at pH 7, with 10 mg of each nanocatalyst added to 10 mL of solution at 20 °C. Catalyst reusability also plays an important role in such applications: after each catalytic cycle, the nanocatalyst was separated from the reaction mixture by centrifugation, washed with ultrapure water/ethanol, and dried at 90 °C for 12 h (21). The applied conditions of the photoreduction reaction are summarized in Table 1.
Powder X-ray diffraction (PXRD) measurements were performed on an X’Pert Pro diffractometer (PANalytical) with Cu-Kα radiation (λ = 1.5406 Å, scanning rate of 0.02 θ·s⁻¹, operating at 40 kV and 40 mA) to determine the crystal phases of the nanocatalysts. Field-emission scanning electron microscopy (FESEM; MAIA3, TESCAN, Czech Republic) at an accelerating voltage of 3 kV was used to examine the morphology and size of the nanocatalysts. Energy-dispersive spectroscopy (EDS) analysis was acquired on the same MAIA3 (TESCAN) instrument at an acceleration voltage of 15 kV. The internal structure of the nanocatalysts (Pt NPs, Au NPs, Ag NPs) and the concentration changes of the colored contaminant solutions were studied using TEM (Zeiss EM10C-100KV, operated at an accelerating voltage of 160 kV) and UV-Vis spectroscopy in quartz cells (Shimadzu UVmini-1240, wavelength range 200-800 nm).
RESULTS
The crystalline state of the Ag NPs, Au NPs and Pt NPs was examined by X-ray diffraction (Figure 2). The diffraction peaks observed for the prepared Ag NPs, located at 38.12°, 44.39°, 64.54°, 77.49° and 81.6°, correspond to the Miller indices (1 1 1), (0 0 2), (0 2 2), (1 1 3) and (2 2 2). Comparison of this pattern with related references (17)(22)(23) shows that the Ag NPs match the reference card JCPDS 04-0783. For the Au NPs, the XRD pattern in Figure 2 shows five diffraction peaks at 38.18°, 44.43°, 64.87°, 77.78° and 82.22°, corresponding to the Miller indices (1 1 1), (0 0 2), (0 2 2), (1 1 3) and (2 2 2), respectively; the characteristic diffraction pattern of Au is referenced in JCPDS card no. 04-0784 (17). The main peaks of the Pt NPs appear at 39.80°, 46.01°, 67.35° and 88.60°, almost identical to the reference card for Pt (JCPDS 04-0802) (17). Thus, the reduction of the silver, gold and platinum ions and the production of pure samples without impurities were confirmed. The crystallite sizes, degree of crystallinity and degree of orientation were calculated using the corresponding equations reported in many papers (17)(18)(21), as shown in Table 2.
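For reference, the crystallite-size values in Table 2 presumably follow the Scherrer relation commonly used in the papers cited above; a sketch of the expression, with the Cu-Kα wavelength given earlier, is:

\[ D = \frac{K\lambda}{\beta \cos\theta} \]

where D is the crystallite size, K ≈ 0.9 is the shape factor, λ = 1.5406 Å, β is the full width at half maximum of the reflection (in radians), and θ is the Bragg angle.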
Figure 2. XRD patterns of Pt NPs, Au NPs and Ag NPs
Figure 3(A-F) presents the FESEM micrographs of the synthesized nanocatalyst particles. As shown in Figure 3(A,B), the Pt NPs had the shape of small cauliflower buds with slightly rough surfaces (see Figure S1(A,C) in the supplementary material) together with a small sphere-like morphology with an average size of 28.71 nm; some sheets were also observed, distributed heterogeneously (Figure S1(B)). According to the FESEM images in Figure 3(C,D), the Au NPs contained small pits (blue arrows, Figure S2(A,B)) and their surfaces were smooth and twisted, stacked side by side and resembling a cactus plant (orange arrows, Figure S2(B,C)); the average size of the Au NPs was 33.20 nm. In Figure 3(E,F), the morphology of the Ag NPs approximates small spheres arranged around one another with an average size of 20.02 nm. The FESEM images in Figure S3(A-C) indicate spherical structures, formed by the aggregation of small spheres and stacked on top of each other, trapping deep pits between them that resemble wells with a larger area than the pits in the Au and Pt nanocatalysts. The TEM images in Figure 4(A-C) reveal the following about the internal structure of the nanocatalysts: the Pt NP sheets were rectangular polygons with small spheres in contact with the polygonal boundaries; the Au NPs were heterogeneous spheres with noticeable roughness nearby; and the Ag NPs were homogeneously spherical over their entire surfaces, with no other structures. The FESEM and TEM observations were thus consistent for all the nanocatalysts (Pt NPs, Au NPs and Ag NPs). Complementing the XRD patterns (Figure 2) and their indication of the purity of the nanocatalyst phases, the EDX spectra (Figure S4(A-C) in the supplementary material) and the elemental percentages revealed that: (i) the elemental signals of Pt, Au, and Ag in the fabricated nanocatalysts are centered at absorption peaks around 2.1 keV, 2.3 keV and 2.2 keV, respectively, and a homogeneous distribution of each constituent element in each sample was suggested (Figure S4(A-C)); (ii) the accompanying reports in the inset table of each spectrum (Figure S4(A-C)) indicate a dominant percentage of Pt in the Pt NPs, Au in the Au NPs, and Ag in the Ag NPs. The EDX spectra also showed minor carbon signals, attributed either to a very small amount of EG remaining on the surface of each nanocatalyst or to adsorption of carbon dioxide on the nanocatalyst surfaces.
Figure 3. FESEM micrographs of (A,B) Pt NPs, (C,D) Au NPs and (E,F) Ag NPs at 1 µm and 500 nm
Figure 4. TEM images of (A) Pt NPs, (B) Au NPs and (C) Ag NPs at 100 nm
The three catalyst structures exhibited diverse nanoscale morphologies and face-centered cubic crystal structures, offering distinct and promising physicochemical properties. These metallic nanocatalysts (Pt NPs, Au NPs and Ag NPs) were therefore exploited as photocatalysts for four contaminants (MB, Rh B, p-NP and TCB) under visible light in the presence of NaBH4. The UV-Vis spectra (Figures S5-S8 in the supplementary material) show the characteristic absorption peaks of MB, Rh B, p-NP and TCB at 664 nm, 554 nm, 410 nm and 265 nm, respectively, which were used to monitor the photoreduction process for 3 h at room temperature against a blank solution of each contaminant at the studied concentration. For comparison, a series of photoreduction tests was also conducted at various concentrations (5, 10, 15 and 20 ppm) under visible light, again with NaBH4 and each nanocatalyst separately. As shown in Figure 5(A-D), the performance of the nanocatalysts differed with contaminant type and concentration. The silver-based nanocatalyst “Ag NPs” had the highest photoreduction capacity at all contaminant concentrations and for each of the four contaminant types (Figure 5(A-D) and Figure S5(A-D)). The mixed structure of small spheres and cauliflower buds of the Pt NPs showed greater catalytic activity against all contaminants than the large cactus-like Au NPs (Figure 5(A-D)). As shown in Figure 6(A-D), the color of the Rh B, MB, p-NP and TCB solutions changed rapidly from colored to colorless, and the maximum absorbance of the contaminant solutions decreased significantly over the three-hour reaction time; the photoreduction reaction was completed within three hours, as shown in Figures S5-S8 (supplementary material). The decrease in absorbance relative to the blank solution of each contaminant was markedly steeper for the Ag NPs and Pt NPs than for the Au NPs, indicating the excellent catalytic performance of the Ag NPs. It should be noted that the TCB solution is colorless, so the color change before and after the photoreduction reaction cannot be followed visually. Ag NPs and Pt NPs were more efficient at catalyzing MB and Rh B than the other two contaminants at low concentrations, although the photoreduction efficiency decreased slightly at high concentrations of these dyes. The colored polyaromatic contaminants (MB and Rh B) were catalyzed rapidly at low concentrations, whereas the monoaromatic contaminants (p-NP and TCB) resisted photocatalysis at both high and low concentrations. Furthermore, photoreduction tests at a concentration of 10 ppm of each contaminant over five reuse cycles revealed excellent catalyst reusability (Figure 7(A-D)).
Figure 5. Photoreduction yield curves of the fabricated nanocatalysts (Pt NPs, Au NPs and Ag NPs) for the contaminants (A) MB, (B) Rh B, (C) p-NP and (D) TCB
Figure 6. Digital photos of the color changes of the contaminant solutions (at a concentration of 10 ppm: (A) MB, (B) Rh B, (C) p-NP and (D) TCB) on the surface of the three nanocatalysts (Pt NPs, Au NPs and Ag NPs)
Figure 7. Reuse yield curves of the nanocatalysts (Pt NPs, Au NPs and Ag NPs) for the contaminants (A) MB, (B) Rh B, (C) p-NP and (D) TCB
DISCUSSION
Recent developments in the synthesis of nanomaterials based on noble metals, their alloys and composites, and their excellent ability to reinforce the surfaces of a large number of materials (such as polymers, naturally occurring materials, and oxides), have attracted the attention of environmental researchers. The wide range of chemical and physical properties of these materials greatly facilitates their application in the treatment of environmental media, in chemical reactions, and in other processes. To obtain monodisperse nanoparticles of these metals, various protocols have been used to form spherical, polygonal, pyramidal, and star-shaped particles through solvothermal/hydrothermal reactions, sonication, and similar routes. These methods apply conditions specific to each route, and mixtures of organic solvents, especially N,N-dimethylformamide (DMF), have been used to establish the optimal preparation parameters. In general, many published papers paid little attention to green-chemistry principles; today, however, a large number of researchers are keen to reduce the potential environmental risks of noble metal nanoparticle preparation. In the present work, green chemistry enters the fabrication process in two important ways: the use of ultrasonication as a green fabrication method, and the use of EG as an environmentally friendly solvent. EG is environmentally safe to the extent required and has exceptional properties: it acts as an accelerating agent and morphological regulator, a mild reducing agent for metal ions, and a good stabilizer, while offering a high boiling point, medium polarity, a relatively high dielectric constant, and environmental compatibility. It is an important component of the solvothermal method for creating homogeneous metallic nanostructures. Researchers who have studied the EG-based formation mechanism of metallic structures have found that this substance acts both as an active structure-forming agent and as a reducing agent for metal ions.
Elucidating the formation mechanism is the cornerstone for understanding the crystallographic and morphological differences among the three nanocatalysts (Pt NPs, Au NPs and Ag NPs). The mechanism of formation of the primary particle units is based on a reduction reaction in two successive stages (mild and strong). The first, mild reduction stage (shown in Figure 8) is characterized by the interaction of the individual metal ions “Mn+” (Mn+ = Pt4+, Au3+ and Ag+) with the reducing agent formed by oxidation of a portion of the EG (23). In the first minutes of heating, the reaction medium became enriched in glycolaldehyde through oxidation by aerobic oxygen, and its concentration increased until it reached a certain saturation limit. During the reaction, the gradual change in glycolaldehyde concentration led to dispersion and an increase in the concentration of the primary nuclei of the corresponding zero-valent metals (Pt(0), Au(0) and Ag(0)). The difference between the metals lies in the reduction potential of each ion relative to that of glycolaldehyde. The redox potentials relative to the hydrogen electrode are: E0(PtCl4^2-/Pt) = +0.90 V, E0(AuCl4^-/Au(0)) = +1.002 V, E0(Ag+/Ag) = +0.791 V and E0(ethylene glycol/glycolaldehyde) = 0.57 V (24). The redox potential of the ethylene glycol/glycolaldehyde couple is thus well suited to a mild reduction of the metal ions.
Each of the formed nuclei had a definite crystal structure. However, because their crystal facets had not yet reached a final surface energy, they did not assume their final crystalline form; at this stage they were susceptible to morphological variation and instability. According to several researchers (25)(26), each ultra-fine nucleus served as a precursor for the growth of further nuclei of the related particle. With continuous heating for four hours and constructive collisions between the nuclei, the stability of the ultra-fine nuclei was reduced through repeated dissolution, which likely led to recrystallization into larger, more energetically stable crystals (26). Reaching very high glycolaldehyde concentrations and a high reductive capacity required prolonged heating with exposure to as much atmospheric oxygen as possible. The second, strong reduction stage under the conditions of the probe-sonication method was then ensured by the formation of bubbles in the solution during sonication and their collapse, both of which are accompanied by locally high temperatures. Water molecules from the crystalline framework of the metal salts break down, generating free radicals (HO● and H●). These radicals are very powerful oxidizers; they attack the portion of the EG molecules not yet converted to glycolaldehyde, forming EG radicals. The abundance of these oxidizing radicals led to further oxidation, producing a medium rich in reducing agents. In parallel, the initially formed M(0) nuclei not only enlarge the activated surface that dissociates air oxygen and accelerates the initial reaction, but also catalyze the self-reduction of the remaining metal ions. Upon completion of the strong reduction of the metal ions, a number of intermediate phases emerged that were fruitful in producing the metal particles in their final crystalline forms. The formation of these phases can be discussed in terms of the role assigned to each agent, namely the viscosity of the reducing medium, the temperature, hydroxyl ions, and hydrazine.
Initially, the metal salts dissolve in EG, much as a metal salt dissolves in a weakly polar solvent. The salts are quickly transformed into the acidic species HAuO3^2- and HPtO3^- of Au and Pt, respectively, similar to what Pan, Karimadom and Fuentes-García reported (27)(28)(29). In this regard, the viscosity of the solution played an important role in the nucleation and reduction stages. The viscosity of the EG solution, about 22 mPa·s at 16 °C, decreases with increasing temperature and can steer the reduction reaction in several ways. First, the decrease in viscosity with increasing temperature enhanced the migration of metal ions in the solution, thereby accelerating and regulating the reduction reaction; in the same context, it provided crystalline M(0) nuclei at significant concentrations in the initial stages of the synthesis, which was required to ensure a favorable initial environment for the final nucleation process. Second, the viscosity of the EG solution implies the presence of two phases (aqueous + organic), which favors the formation of an inverse micelle system (30).
In such systems, as discussed in Holade’s paper (31) and consistent with the formation mechanism, the dissolved ionic intermediates of the metals were concentrated within the micelle droplets and surrounded by EG molecules. As a result, there is no massive flood of distorted primary nuclei; instead, a large portion of the ions is protected from random reduction and restricted in its movement. The greatest benefit of this appears in the later stage of fabrication, the sonication stage. After four hours of fabrication, sonication of the solution containing the micelle systems (EG with the ionic forms of the metal components, AuO3^3- and PtO3^2-) began. Free radicals such as HO●, H● and HOCH2CH●OH penetrated the bicontinuous phase (27)(28)(29)(31), facilitating the separation of the EG layer (the outer micelle layer) from the ionic metal constituents (the inner micelle layer). This caused the acidic metal species (HAuO3^2- and HPtO3^-) to collide directly with the free radicals (HO●, H● and HOCH2CH●OH), generating hydroxide-based intermediates ([Pt(OH)6]-, AgOH and [Au(OH)4]-), which aligns with Kimberly’s proposals (32). According to Vasilchenko’s paper (33), adjusting the solution to an alkaline medium was valuable for the formation of the structurally and thermodynamically stable complex precursors of the noble metals. Reduction of such metal-hydroxide intermediates ([Pt(OH)6]-, [Ag(OH)2]- and [Au(OH)4]-) readily generated stable zero-valent metal structures upon final reduction with hydrazine. Adjusting the pH of the solution was also significant because the reducing power of hydrazine increases in alkaline media (34). The good diffusion of the micelles creates a steric effect between the particles, forming finely crystallized metal nuclei for the target particles (Pt NPs, Au NPs and Ag NPs). This was useful for producing deposits of non-aggregated noble metal particles (Pt NPs, Au NPs and Ag NPs) with a specific crystal structure and spherical or hybrid morphologies. It can be inferred from the polyol-based mechanism, as reported in related studies (23)(35)(36), that the noble metal ion reduction and oxygen dissociation reactions proceed without hydroxyl ions, albeit at a very low rate.
Regarding reverse micelles, EG undeniably provides a favorable environment for their formation, as if a surfactant were present. Owing to the pronounced viscosity of EG and its high concentration relative to the small amount of water introduced with the metal salts, a water-in-oil-like system is formed. EG oxidation products (particularly glycolaldehyde, produced by the redox reaction in which the metal ions are reduced) play a role similar to that of surfactants, resulting in a relatively stable micelle structure (31)(37)(38). Two experimental observations suggest the formation of these intermediate species: (i) the pH of the initial solution formed by dissolving the primary salts in EG was found to be 1;
(ii) a slight change in the color of the resulting initial solutions (the Au3+/EG solution from yellow to very dark gold, the Pt4+/EG solution from intense orange to orange-brown, the Ag+/EG solution from transparent to pale gray) after four hours of continuous stirring at 75 °C. Regarding the literature on the possibility of forming such intermediate compounds (HAuO3^2- and HPtO3^-), the extensive thermodynamic study of the gold-chloride-water system in Yuan’s paper shows that acidic and oxidizing conditions are suitable for the formation of stable oxo-acid complexes of gold (HAuO3^2-). According to the same paper, these complexes are amphoteric in nature and tend to be strongly acidic; they therefore dissolve easily in alkaline media and are converted to Au(0). Yuan (39) elaborated, particularly when studying the redox (E0-pH) diagram, that a number of intermediate compounds (HAuO3^2-, AuO3^3- and H2AuO3^-) can form as intermediate phases in equilibrium with Au(OH)3 and combine with one another to favor the formation of Au(0) in an alkaline medium. Kurniawan (40) and Malhotra (41) confirmed through electrochemical studies that Au can form relatively stable acidic intermediates and convert to the more structurally stable hydroxide in highly alkaline media.
Comparison of the prepared nanocatalysts allowed a useful correlation between their morphological and crystallographic properties. All three nanocatalysts exhibited a high degree of crystallinity, good crystal orientation, and a good crystallite size, and they shared the same face-centered cubic (FCC) crystal system. Repeated recrystallization of the small-scale nuclei and their interphase fusion strengthened the metallic bonding in the crystal lattice within the solid zones, while the crystallization of irregular, ill-defined nuclei in the soft zones was inhibited during recrystallization. Large-scale orientation of the crystal structures of the nuclei was present in the solid zones. In essence, the formation of the ordered microcrystalline phase of the target particle nuclei dominates the disordered microcrystalline phases; this is natural, as its formation rate is greater than or equal to the disintegration rate of the disordered phase. The disordered microcrystalline phase disappeared spontaneously with the temperature change between the initial heating stage and the bubble collapse during sonication, consistent with the Ostwald ripening principle. On the other hand, the stress alignment of the soft (crystalline and semi-crystalline) zones of the micelles exerted a torque perpendicular to the strain direction, promoting growth along the primary crystal growth axis guided by the hard crystalline zones. These perpendicular forces, generated by the crystallized parts of the soft zones, reduce the deformation of the formed crystals. Furthermore, the excessive elongation of the EG chains within the micelle structure caused the solid zones of the small-scale nuclei to reorient and restructure along the crystal axis; this effect increased with the distribution and dissipation of stresses, encouraging the formation of a uniform crystalline system in the nanocatalyst particles. Undeniably, the slightly crystalline micelles acted as large-strain dampers, reducing crystal distortion.
This indicates that the solid-zone reflections observed in the diffraction patterns, represented by the (1 1 1) peak, were due to the good alignment of crystals in these zones, which were held together by strong metallic bonds (42). This results in good crystallinity indices (degree of crystallinity and orientation). The results (Table 1) show that the platinum-based catalyst particles have the lowest crystallinity indices compared to the other two catalyst particles (Au NPs and Ag NPs). This was attributed to the presence of two morphologies (spherical and sheet) (21). Sheets were rarely observed in Pt NPs (Figure 3(A&B)); a logical explanation is that their primary nuclei grew in all directions and that platinum ions exhibited poor reducibility in EG (34). The polyol method, on the other hand, typically favors the reduction of gold ions and the formation of more homogeneous and uniform morphologies. Silver is the most responsive to the reduction of its ions in EG to Ag(0). The last observation worth noting is that the large crystallite sizes of the Ag-nanocatalyst and Au-nanocatalyst explain the sharpness of their reflection peaks, unlike the Pt-nanocatalyst. The {1 1 1} facet is the lowest-energy facet, and there is order in the formed polycrystals and low surface roughness, as in the case of the Ag-nanocatalyst. The {1 1 1} facet in Pt NPs is lower in energy than the {1 1 0} facet, resulting in a predominantly spherical morphology, which requires a more regular crystal structure than gold. In other words, the effect of both crystalline facets can be seen in the Pt- and Au-based nanoparticles. In essence, the formation of the fine crystalline phase of the target particle nuclei dominates the amorphous microphases. This is natural, as the rate of their formation is greater than or equal to the rate of disintegration of the disordered crystalline phase of the target particle nuclei. The latter phase disappears spontaneously with the temperature change between the initial heating stage and the bubble bursting during the sonication stage, consistent with the Ostwald ripening principle. As seen in Figure 5, the catalytic performance of the three particles can be ordered according to the type of nanocatalyst particle, the type of contaminant, and its concentration as follows:
It is perhaps important to establish a link between the laboratory photoreduction results in the presence of NaBH4 and the structure of the three catalyst particles. The first link (crystal structure and photocatalysis) is that, since crystals exposing the {1 1 1} facet alone, without admixture of the {1 1 0} facet, are at the lowest possible energy, their catalytic sites can be controlled and encouraged to complete the catalytic reaction in the best possible way. Mixtures of {1 1 1} and {1 1 0} crystal facets, as in Pt NPs and Au NPs, pose energetic barriers to catalysis because they lack the activation energy required to complete the reaction. Crystals originating from finer nucleation centers exhibit better reactivity in photophysical reactions, as in the Ag-nanocatalyst first, followed by the Pt-nanocatalyst. Such well-crystallized centers have the opportunity to interact with light, create electronic transitions, and form free radicals (•OH and •O2⁻) that accelerate the catalytic reaction. The Au-nanocatalyst, with a crystallinity between that of the Pt-nanocatalyst and the Ag-nanocatalyst, did not have the advantage of interacting sufficiently with light to produce free radicals (•OH and •O2⁻). The higher degree of crystallinity of Ag NPs (60.29%) and the lower degrees of Pt NPs (46.64%) and Au NPs (50.07%) mean that the surface area of Ag NPs was increased, securing sites along their entire surface and utilizing them as sites for free-radical oxidation. The effect of crystallite size, in turn, stems from its influence on the electronic state (particularly the conduction and valence bands). Scientific observations suggest that this generates energetically active intermediates and a huge number of photoactive agents, which further accelerates the photoreduction reaction (36)(43). The second link is morphology and photocatalysis. Typically, the entire photoreduction reaction depends on the morphology of the particles or the prepared nanostructure. The Ag-nanocatalyst particles, which yielded the best photoreduction performance, had a uniform and homogeneous spherical surface morphology. Spherical morphology, a type of zero-dimensional morphology, adsorbed electrons at the same rate in all dimensions (x, y, z) (45). As shown in the FESEM (Figure 3(E&F)) and TEM images (Figure 4(C)), the pits trapped between the spherical particles were small and deep, creating a morphologically impermeable internal surface (21). When light penetrates the surface of the impermeable structure, electrons, generated by the interaction of the Ag-nanocatalyst particle’s surface with visible-light radiation, fall into the pits and become trapped (21). This accelerates the collision rate with the Ag NPs and creates a stream of photoactive agents. The Ag spheres increase the adsorption order of BH4⁻ on their surface and encourage rapid movement of the liberated hydrogen across the surface of this nanocatalyst. The last two observations highlight how morphological features, through their synergistic effects on the dispersion of photoactive species and the facilitation of hydrogen transfer, can contribute to improved photoreduction efficiency. The spherical structure of the Pt-nanocatalyst particles also contributed to their improved catalytic performance. While the lamellar structure did provide good catalytic performance for the Pt-nanocatalyst particles, their performance differed slightly from that of the Ag-nanocatalyst particles.
The reason is the homogeneity of the surface morphology of the Ag-nanocatalyst; the structural heterogeneity of the Pt-nanocatalyst appears to have reduced its catalytic performance. Regarding the sheet structure, the following can be concluded: the corner sites in the short, thin, polygonal sheets are more highly occupied than others and exhibit very good selectivity for the adsorption of hydrogen liberated from BH4⁻. The Pt-nanocatalyst particles could have demonstrated better catalytic performance if they had a larger number of sheet sites, resulting in a higher edge-to-corner ratio, as reported in Zhou’s research (43). Similarly, the Au-nanocatalyst particles, with pits of different sizes (large and small) and an agglomerated polygonal structure, showed deteriorated catalytic performance. The catalytic performance of nanoporous noble metal catalysts (Pt/Au-nanocatalysts) is related to the compressive strain factor. Two types of pits can be detected in such structures (primary pits and secondary pits), according to Malekian (46), intertwined within the same Au-nanocatalyst morphology. The existing agglomeration and pits of different sizes can induce differential compressive strain and deformation in these pits. The deformation is large in large pit structures and small in small pit structures (46). Because large pits are more resilient to compression, the creation of large compressions by the agglomerates significantly reduces the pit size and, in turn, affects small pits to almost the same extent. It should also be noted that if the agglomerated structure itself is porous, its effect is different and separate from the compressive strain in noble metal structures (46). This topic is not discussed in the current study because the Au-nanocatalyst agglomerated structure is not porous. Therefore, it is very likely that the compressive strain was not large and the adsorption energy was insufficient, which reduced the adsorption of liberated hydrogen and the generation of photoactive agents. This resulted in a reduction in the catalytic performance of the Au-nanocatalyst surface. Returning to the discussion of contaminant type, the two dye contaminants (MB and Rh B) were the easiest and fastest to photocatalyze compared to the two petroleum contaminants (p-NP and TCB). The colored polyaromatic heterocyclic contaminants (MB and Rh B), owing to their π-electrons and HOMO-LUMO system, are able to transition to an excited state (MB* and Rh B*) upon collision with photons of light. When the excited states MB* and Rh B* return to their ground states, a certain amount of energy is released. This energy complements the energy released by photon collisions with the individual nanocatalyst particles. In contrast, the two petroleum contaminants consist of a single homologous aromatic ring, making electronic excitation difficult. However, the presence of hydroxyl groups in p-NP, compared with chlorine groups in TCB, makes the phenol ring more active toward the photoreduction reaction than TCB. Finally, increasing the concentration of any contaminant caused a downward slope in the photoreduction reaction. An increasingly thick layer of contaminant surrounded the nanocatalyst surface as its concentration increased. This thicker layer reduced the penetration of light needed for electronic excitation and the transfer of hydrogen liberated from BH4⁻ to the nanocatalyst layers (either Pt-nanocatalyst or Au-nanocatalyst), resulting in a lower photoreduction rate.
Considering the above reasoning and the mechanism proposed by Shafiq (35), the proposed photoreduction mechanism was attributed to two complementary pathways: generation/transfer of photoactive agents, and hydrogen donation/movement. Initially, two components (contaminant molecules and BH4⁻ ions) were adsorbed simultaneously. Adsorption occurred due to a charge difference, as the components (MB, Rh B and BH4⁻) dissolved in the aqueous medium are negatively charged, while the catalyst particles are positively charged. Furthermore, the characteristics of the components involved in the photocatalysis (NaBH4, contaminant, nanocatalyst) were, respectively: nucleophilic, electrophilic, and surface-organized for hydrogen movement and photoactive-agent generation. The catalyst surface interacted with photons, exciting electrons from the valence band to the conduction band. At the same time, the electrophilic molecules were excited, generating a stream of electrons and holes. These can react with water molecules, splitting them and generating free radicals (•OH and •H). The nanocatalyst surface was activated to regulate the presence and migration of hydrogen and to deliver photoactive agents to the catalytic reaction medium. This system reflected the behavior of the nanocatalyst enhanced with NaBH4 against the four contaminants (23) (35) (36) (47) (48) (49) (50).
CONCLUSIONS AND RECOMMENDATIONS
In conclusion, a simple, green strategy is reported, detailing most of the steps involved in the fabrication of a set of three noble metal nanocatalysts: Ag NPs, Au NPs and Pt NPs. The nanocatalyst backbones were well-structured, pure, and predominantly nano-spherical, while the Au NPs and Pt NPs also exhibited polygonal, agglomerated morphologies and sheet morphologies. This strategy provided an efficient route for fabricating nanoparticles of the three noble metals, establishing a strong link between their structural and crystalline characteristics. Building on this link, a deeper study was conducted on the catalytic reduction performance of these metal nanoparticles in the reduction of four toxic contaminants (MB, Rh B, p-NP and TCB) at varying concentrations, under visible-light irradiation and using NaBH4. The results demonstrated outstanding catalytic reduction behavior, excellent stability, and reusability for each nanocatalyst. After extensive discussion, this work revealed the suitability of the Ag-nanocatalyst for organic-pollutant catalytic applications through rational structural integration of its nanoparticles. The Pt-nanocatalyst came in second place, followed by the Au-nanocatalyst. It is believed that the presented concepts could be extended to a wide range of applications by studying the following proposals: constructing other hybrid materials from these metallic particles, functionally modifying their surfaces with natural polymeric substrates, and exploring other green and sustainable methods for fabricating them with new structural specifications. It is also suggested to complement these studies by conducting analyses such as GC-MS, HPLC, and NMR, calculating environmental indicators to assess the toxicity of the resulting compounds and those released into the environment, such as POD, COD, and TOC, and estimating indicators specific to catalytic reactions, such as TOF and TON. It is also suggested to enhance the catalytic functions of the fabricated nanoparticles so that they can be applied in the field of photodegradation.
The Internet has become an indispensable part of modern life, providing access to information on an unprecedented scale. However, this digital landscape also presents a growing number of security risks, including the proliferation of malicious URLs, often hidden within emails, social media posts, and compromised or malicious websites. When a user accidentally clicks on a malicious URL, it can cause a variety of damage to both the user and the organization. These URLs can redirect users to phishing sites that cybercriminals have carefully designed to look like legitimate sites, such as banks, online retailers, or government agencies. These phishing sites aim to trick users into voluntarily divulging sensitive information, including usernames, passwords, credit card numbers, Social Security numbers, and other important personal data, which can result in serious damage such as financial loss and the use of the data to defraud others (1). The continued development of phishing sites, which often use advanced social engineering techniques, increases the risk of exploiting users’ trust despite their security awareness training (2). Malicious URLs are one of the most common ways malware spreads. A single click on a malicious URL can trigger the download and installation of a wide range of malware, including viruses, Trojans, ransomware, spyware, and keyloggers, without the user noticing (3). These malicious programs can compromise the user’s device, steal data, encrypt files and demand a ransom to decrypt them, monitor user activity, or even give attackers complete remote control over the device on which the malware is installed. Ultimately, this can cause financial, operational, or reputational damage to companies and organizations that hold user data (4). Traditional methods for detecting malicious URLs, such as blacklisting, fail to effectively identify newly emerging threats because they rely on pre-defined lists of malicious URLs, leaving a gap in protection against unknown or newly created malicious links. Attackers constantly work to circumvent blacklists by creating new URLs and using techniques such as URL shortening and domain spoofing (creating domains that visually resemble legitimate ones) (5). Furthermore, attackers use sophisticated social engineering techniques, crafting convincing messages and deceptive links that exploit human psychology to lure users into clicking, effectively bypassing many technical defenses (6). Whitelisting, an alternative approach in which only pre-approved URLs are allowed, severely restricts user access and is often impractical for general Internet use. Machine learning has emerged as a powerful tool in the field of cybersecurity, providing more dynamic and efficient solutions to these sophisticated threats by harnessing the power of data analysis. Machine learning algorithms can learn patterns and characteristics associated with malicious URLs, enabling them to accurately classify unknown URLs (7). Unlike traditional systems, machine learning models can adapt to new URL patterns and identify previously unseen threats, making them a critical component of proactive cybersecurity protection. Deep learning enhances this capability further by detecting subtle indicators of maliciousness (8) that traditional methods or even simple machine learning approaches may miss.
Despite the accuracy that deep learning models may provide, they require more time in the detection process, prompting us to consider a way to combine the speed of traditional machine learning models with the accuracy of deep learning models (9,10). This leads us to explore ensemble models. This research focuses on exploring the effectiveness of machine learning and deep learning techniques for detecting malicious URLs, specifically investigating the potential of ensemble learning methods to enhance the accuracy and efficiency of detection. We aim to contribute to the advancement of cybersecurity by:
Analyzing the essential components and features of URLs: Extracting the essential lexical features that distinguish benign from malicious URLs. This will include a deep dive into the structural elements of URLs and an exploration of how features such as URL length, character distribution, presence of specific keywords, and domain characteristics can be used to identify potentially malicious URLs.
Investigating the performance of various classification algorithms: Discovering the most efficient models for URL classification. This will include a comparative analysis of different machine learning algorithms, including both traditional methods (e.g., support vector machines and naive Bayes) and more advanced deep learning methods (e.g., convolutional neural networks and recurrent neural networks). The goal is to identify the algorithms that are best suited to the specific task of detecting malicious URLs, taking into account factors such as accuracy and speed.
Proposing and testing ensemble learning techniques: Exploring the benefits of combining multiple models to improve accuracy and reduce training time. Ensemble techniques such as bagging and stacking offer the potential to leverage the strengths of different individual models, creating a more robust and accurate detection method overall.
This research specifically investigates the effectiveness of ensemble techniques, especially bagging and stacking, in the context of detecting malicious URLs. First, we extract and analyze lexical features from the dataset, pre-process the data, and then compare the performance of several classification algorithms, including traditional machine learning models, deep learning, and ensemble learning. Finally, we evaluate the effectiveness of bagging and stacking techniques, highlighting their potential to enhance detection capabilities and reduce training and testing time, thus strengthening cybersecurity measures against malicious URL threats. Detecting malicious URLs has been an important focus of cybersecurity research, with many studies exploring a wide range of machine learning, deep learning, and ensemble methods. These efforts can be categorized based on the approach used to detect malicious URLs.
Machine Learning Classifiers
The basic approach involves applying traditional machine learning classifiers. Xuan et al. (11) investigated the use of support vector machines (SVM) and random forests (RF) to distinguish malicious URLs. Their imbalanced dataset included 470,000 URLs (400,000 benign and 70,000 malicious). While the random forest showed superior predictive effectiveness, its training time was quite long, although the testing times of the two models were similar. Vardhan et al. (12) performed a comparative analysis of several supervised machine learning algorithms, including naive Bayes, k-nearest neighbors (KNN), stochastic gradient descent, logistic regression, decision trees, and random forest. They used a dataset of 450,000 URLs obtained from Kaggle. Of these algorithms, the random forest consistently achieved the highest accuracy. However, a major limitation identified was the high computational cost associated with the random forest, which hinders its deployment in real-time applications. Awodiji (13) focused his research on mitigating threats such as malware, phishing, and spam by applying SVM, naive Bayes, decision trees, and random forests. For training and evaluation, he used the ISCX-URL-2016 dataset from the University of New Brunswick, known for its diverse representation of malicious URL types. The random forest algorithm achieved the best accuracy (98.8%), outperforming the other algorithms. However, the study lacks specific details regarding the training time and computational resource requirements of each algorithm, making it difficult to evaluate their overall efficiency. Velpula (14) proposed a random forest-based machine learning model that combined lexical, host-based, and content-based features. This approach leveraged a dataset from the University of California, Irvine, Machine Learning Repository containing 11,000 phishing URLs. The dataset was rich in features, including static features (e.g., domain age and URL length) and dynamic features (e.g., number of exemplars and external links). While the combination of diverse features significantly improved the model’s accuracy to 97%, the research did not explore the potential of other machine learning algorithms. Reyes-Dorta et al. (15) explored the relatively new field of quantum machine learning (QML) for detecting malicious URLs and compared its effectiveness with classical machine learning techniques. They used the “Hidden Phishing URL Dataset,” which included 185,180 URLs. Their results showed that traditional machine learning methods, especially SVM with a Radial Basis Function (RBF) kernel, achieved high accuracy levels (above 90%). The research also highlighted the effectiveness of neural networks but noted that the current limitations of quantum hardware hinder the widespread application of QML in this field, leaving classical machine learning models, which benefit from continuous improvement, as the better performers.
Deep Learning Models
Deep learning, with its ability to learn complex patterns from data, has emerged as a promising approach for detecting malicious URLs. Johnson et al. (16) conducted a comparative study of traditional machine learning algorithms (RF, C4.5, KNN) and deep learning models (GRU, LSTM, CNN). Their study confirmed the importance of lexical features for detecting malicious URLs, using the ISCX-URL-2016 dataset. The results indicated that the GRU (Gated Recurrent Unit) deep learning model outperformed the Random Forest algorithm. However, the researchers did not compare these models with other machine learning and deep learning algorithms to explore whether better accuracy could be achieved. Aljabri et al. (17) evaluated the performance of both machine learning models (Naive Bayes, Random Forest) and deep learning models (CNN, LSTM) in the context of detecting malicious URLs. The researchers used a large, imbalanced dataset obtained by web crawling with MalCrawler: 1.2 million URLs were used for training, of which 27,253 were considered malicious and 1,172,747 benign, and 364,000 URLs were used for testing. The dataset was validated using Google’s Safe Browsing API. The results showed that the Naive Bayes model achieved the highest accuracy (96%). However, the study had limitations, including the unexplored potential of other machine learning and deep learning algorithms and the uneven class distribution within the dataset. These limitations may limit the generalizability of the results and potentially introduce bias into the model. Gopali et al. (18) proposed a new approach by treating URLs as sequences of symbols, enabling the application of deep learning algorithms designed for sequence processing, such as TCN (Temporal Convolutional Network), LSTM (Long Short-Term Memory), BiLSTM (Bidirectional LSTM), and multi-head attention. The study specifically emphasized the important role of contextual features within URLs for effective phishing detection. Their results confirmed that the proposed deep learning models, particularly BiLSTM and multi-head attention, were more accurate than other methods such as random forests. However, the study used a specialized dataset, limiting the generalizability of the results to other URL datasets, and did not comprehensively evaluate a broader range of other deep learning and machine learning algorithms.
Ensemble Learning Approaches
In addition to single classifiers, ensemble approaches, which combine multiple models, have been explored to improve detection performance. Chen et al. (19) leveraged the XGBoost algorithm, a boosting algorithm. Boosting is a popular ensemble learning technique known for its classification speed. Their work emphasized the importance of lexical features in identifying malicious URLs. Through feature selection, they initially identified 17 potentially important features, and then refined them to the nine best features to reduce model complexity while maintaining a high accuracy of 99.93%. However, the study did not provide a sufficiently detailed description of the required training time and computational resources consumed by the XGBoost model.
Feature Engineering and Selection
Recognizing the importance of feature quality to model performance, some research has focused specifically on feature engineering and selection techniques. Oshingbesan et al. (20) sought to improve malicious URL recognition by applying machine learning with a strong focus on feature engineering. Their approach involved the use of 78 lexical features, including hostname length, top-level domain, and the number of paths in a URL. Furthermore, they introduced new features called “benign score” and “malicious score,” derived using linguistic modeling techniques. The study evaluated ten different machine learning and deep learning models: KNN, random forest, decision trees, logistic regression, support vector machines (SVM), linear SVM, feed-forward neural networks (FFNN), naive Bayes, K-Means, and Gaussian mixture models (GMM). Although the K-Nearest Neighbors (KNN) algorithm achieved the highest accuracy, it suffers from significant drawbacks in terms of training and prediction time requirements. Mat Rani et al. (21) emphasized the critical role of selecting effective features for classifying malicious URLs. They used information gain and tree-shape techniques to improve the performance of machine learning models, particularly in the context of phishing site detection. The study used three classifiers: Naive Bayes, Random Forest, and XGBoost. Features selected using the tree-shape technique showed a significant positive impact on accuracy. While XGBoost achieved the highest accuracy of 98.59%, the study did not fully explore the potential of other deep learning algorithms or delve into aspects of model efficiency, such as speed and resource requirements during the training and testing phases. Even though machine learning and deep learning methods have achieved high accuracy in identifying malicious URLs, there are concerns regarding training and prediction time efficiency and the complexity of tuning hyperparameters (11–21). Ensemble methods, such as Random Forest and XGBoost, are effective due to their ability to handle high-dimensional data, improve accuracy, and reduce overfitting. However, they often have higher computational requirements (20), (21). Despite the great efforts made by researchers to detect malicious URLs, critical analysis reveals several points that require further attention.
Real-Time Applications: Most studies have focused on achieving high accuracy but do not delve into time efficiency, which is critical for detecting malicious URLs, especially in light of the rapid technological development. This limitation raises concerns about the feasibility of using these models in real-time applications (11–21).
Data Imbalance: Most of the datasets used in prior research suffer from an imbalance between benign and malicious URLs (11–21). This imbalance significantly impacts model training and may bias the model’s performance in favor of the dominant class. Techniques such as over-sampling and under-sampling are needed to address this issue and enable more reliable evaluation.
Feature Extraction and Selection: Some research shows the need to explore how features are extracted, transformed, and selected effectively to improve training and prediction efficiency and accuracy (14,17,19–21).
MATERIALS AND METHODS
Hardware Specifications
The experiments in this research were conducted using Google Colaboratory (Colab) with virtual CPU settings to ensure methodological consistency. Colab operates on a dynamic resource allocation model, and the predominant configuration consisted of an Intel Xeon processor with two virtual central processing units (vCPUs) and 13 GB of RAM. We acknowledge that slight inter-session variations in resource assignment are possible.
Dataset
The dataset (benign and malicious URLs) (22) used in this research consists of 632,508 rows with an equal distribution of 316,254 benign URLs and 316,254 malicious URLs, organized in three columns, “url”, “label”, and “result”, which contain the URL itself, the corresponding classification label (either ‘Benign’ or ‘Malicious’), and the classification result as an integer value (0 for benign and 1 for malicious). We extracted a total of 27 lexical features from each URL, as shown in Table 2.
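As an illustration of how such lexical features can be derived, the sketch below computes a handful of them with Python’s standard urllib. The function name, the specific features shown, and the example URL are illustrative only; they do not enumerate the 27 features of Table 2.

```python
from urllib.parse import urlparse

def extract_lexical_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a raw URL string."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host, path = parsed.netloc, parsed.path
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(path),
        "digit_count": sum(c.isdigit() for c in url),
        "special_char_count": sum(url.count(c) for c in "@-_?=&%"),
        "subdomain_count": max(host.count(".") - 1, 0),
        "has_ip_host": int(host.replace(".", "").isdigit() and host.count(".") == 3),
        "url_scheme": parsed.scheme,
    }

# Hypothetical example URL
print(extract_lexical_features("http://login-secure.example.com/verify.php?id=123"))
```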
Data Preprocessing
Data preprocessing is critical to achieving reliable and accurate results, as missing values and inconsistencies can introduce significant bias during the training process, leading to inaccurate predictions. Preprocessing steps, such as cleaning, integration, transformation, and normalization, improve model performance and prevent overfitting by ensuring data consistency and representation.
Data Cleaning
Missing values (NaN) and inconsistent data were removed from the dataset, ensuring its completeness and accuracy so that the model could be trained reliably. After this deletion, the dataset became imbalanced. To rebalance it, the Random Under-Sampling technique was applied, randomly removing samples from the majority class. The resulting balanced dataset was saved, and the remaining preprocessing steps were applied to it. Figure 1 shows the balanced distribution of samples:
Fig 1. Distribution of URLs after applying the Random Under Sampling technique to balance the samples
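A minimal sketch of the cleaning and rebalancing step, assuming the imbalanced-learn library and a hypothetical url_features.csv file holding the extracted features together with the ‘result’ label column:

```python
import pandas as pd
from imblearn.under_sampling import RandomUnderSampler

# Hypothetical file name; 'result' is the 0/1 label column described above.
df = pd.read_csv("url_features.csv").dropna()  # remove rows with missing values

X, y = df.drop(columns=["result"]), df["result"]
rus = RandomUnderSampler(random_state=42)      # randomly drop majority-class rows
X_bal, y_bal = rus.fit_resample(X, y)

print(y_bal.value_counts())  # equal numbers of benign (0) and malicious (1) samples
```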
Data Integrity
A consistent data structure was maintained by standardizing column names and data formats and by ensuring that the dataset contained no duplicates or inconsistencies in the format of the different attributes.
Data Transformation
Categorical features such as url_scheme and get_tld were converted into a numeric format that can be easily processed by the various machine learning algorithms. This involved converting categorical variables into multiple numeric variables: the url_scheme feature was expanded into four features, one per protocol, producing the four new features shown in Table 3. The top-level domain feature (get_tld), which is categorical, was converted using an ordinal encoder so it could be processed by the algorithms (23).
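This transformation can be sketched as follows; the column names url_scheme and get_tld follow the text, while the one-hot prefix is illustrative and X_bal denotes the balanced table from the previous step.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# One-hot encode the URL scheme (one column per protocol) and
# ordinally encode the top-level domain.
X_bal = pd.get_dummies(X_bal, columns=["url_scheme"], prefix="scheme")

tld_encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
X_bal["get_tld"] = tld_encoder.fit_transform(X_bal[["get_tld"]])
```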
Data Normalization
The numeric attributes of the data were normalized using appropriate techniques. Features such as url_length, path_length, and host length were normalized to achieve better model performance by equalizing the impact of these attributes, which differ in magnitude.
Feature Selection
Correlation-based feature selection was used to examine the relationship between the features and the target variable. This method is characterized by rapid feature selection while maintaining classification accuracy. The most influential features, those with significant correlations with the target variable and small correlations among themselves, were then selected to reduce redundancy and simplify model building (24,25). Based on their correlation with the target variable [‘result’], thirteen independent variables (features) were selected, as shown in Figure 2. Figure 3 shows the data preparation process, illustrating all the steps taken to obtain a balanced dataset.
Fig 2. The lexical features most correlated to the dependent variable (result)
Fig 3. Data preparation process
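The correlation-based selection described above can be sketched with pandas as follows. The two thresholds are illustrative placeholders, not the values used in the study, and X_bal/y_bal come from the preprocessing sketch.

```python
# Keep features strongly correlated with the target and weakly correlated
# with features already kept.
corr = X_bal.join(y_bal).corr(numeric_only=True)
target_corr = corr["result"].drop("result").abs()

selected = []
for feat in target_corr.sort_values(ascending=False).index:
    if target_corr[feat] < 0.2:          # illustrative relevance threshold
        break
    if all(abs(corr.loc[feat, kept]) < 0.85 for kept in selected):  # redundancy check
        selected.append(feat)

print(len(selected), selected)  # the study reports 13 retained features
```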
Proposed Solution
This study proposes an innovative approach to detecting malicious URLs using ensemble learning techniques, specifically Bagging (Bootstrap aggregation) and stacking. Bagging uses 50 decision trees as its base models, which yielded better results than using more or fewer trees, as shown in Tables 9 and 10; majority voting is used to obtain the final predictions, as shown in Figure 4. Stacking uses AdaBoost, Random Forest, and XGBoost as base models and a random forest as the meta-model to obtain the final predictions, as shown in Figure 5. These techniques combine predictions from multiple base learners, resulting in a faster and more accurate classification model. Bagging is a statistical procedure that creates multiple datasets by sampling the data with replacement to obtain a final prediction with minimal variance (26). Stacking combines weak learners to create a strong learner; it combines heterogeneous parallel models to reduce their bias. Stacking is also known as stacked generalization. Similar to averaging, all models (weak learners) contribute, based on their weights and performance, to build a new model on top of the others (27). AdaBoost, Random Forest, and XGBoost were used as weak learners to gain different perspectives on the dataset and avoid duplicate predictions.
Fig 4. Proposed bagging model Fig 5. Proposed stacking model
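A minimal sketch of the two proposed ensembles, assuming scikit-learn ≥ 1.2 (which uses the estimator parameter name) and the xgboost package; the train/test split of the balanced data (X_bal, y_bal) is shown only for illustration.

```python
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# X_bal, y_bal are the balanced, preprocessed features from the earlier sketches.
X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.2, stratify=y_bal, random_state=42)

# Bagging: 50 decision trees with bootstrap sampling; majority voting is implicit.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=0.80,
    bootstrap=True,
    random_state=42,
)

# Stacking: AdaBoost, Random Forest and XGBoost as base learners,
# with a Random Forest meta-model.
stacking = StackingClassifier(
    estimators=[
        ("ada", AdaBoostClassifier(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("xgb", XGBClassifier(eval_metric="logloss", random_state=42)),
    ],
    final_estimator=RandomForestClassifier(random_state=42),
    n_jobs=-1,
)

bagging.fit(X_train, y_train)
stacking.fit(X_train, y_train)
```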
Verifying the Results
To analyze the effectiveness of the proposed solution extensively, its performance was compared with that of many traditional machine learning algorithms and deep learning techniques. The algorithms that were implemented and evaluated include:
Traditional Machine Learning Algorithms
The machine learning algorithms evaluated included several with specific parameter settings. The Decision Tree was configured with random_state=42 for consistent results. Logistic Regression used max_iter=5000 and random_state=42. The SVM was trained with a linear kernel (kernel=’linear’) and random_state=42. Finally, K-Nearest Neighbors was set to consider 3 neighbors (n_neighbors=3). Gaussian Naive Bayes and Bernoulli Naive Bayes were used with default parameter settings without adjustments.
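For reference, these configurations map directly onto scikit-learn constructors; the sketch below only instantiates the models with the parameters stated above.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB, BernoulliNB

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=5000, random_state=42),
    "SVM (linear kernel)": SVC(kernel="linear", random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Gaussian NB": GaussianNB(),    # default parameters
    "Bernoulli NB": BernoulliNB(),  # default parameters
}
```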
Deep Learning Algorithms
The deep learning models (CNN, FFNN, and RNN) were configured with various parameters to tune their performance. The CNN comprised several convolutional and max-pooling layers, a flatten layer, and two Dense layers. The FFNN used the Adam optimizer with an initial learning rate of 0.001 and consisted of three Dense layers with different numbers of parameters. The RNN contained two SimpleRNN layers and also used Adam. Finally, the Radial Basis Function Network was configured with hidden_layer_sizes=(10) and a maximum of 1000 iterations.
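As an illustration of one of these configurations, a minimal Keras sketch of the FFNN is shown below. The layer widths and the 13-feature input shape are assumptions; the text fixes only the three Dense layers, the Adam optimizer, and the 0.001 learning rate.

```python
from tensorflow import keras

ffnn = keras.Sequential([
    keras.layers.Input(shape=(13,)),            # 13 selected lexical features
    keras.layers.Dense(64, activation="relu"),  # illustrative layer widths
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
ffnn.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
             loss="binary_crossentropy",
             metrics=["accuracy"])
ffnn.summary()
```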
Ensemble Learning Algorithms
The ensemble learning algorithms employed a variety of configurations to create robust predictive models. The initial Voting Classifier was set to use a hard voting strategy. The initial Stacking Classifier integrated a Decision Tree Classifier with random_state=42 as its final estimator, utilized all available cores (n_jobs=-1), and employed passthrough. Bagging Classifiers were configured with 50 base estimators (n_estimators=50), a Decision Tree Classifier (with the default random state) as the base estimator, a max_samples value of 0.80, specified bootstrap sampling, and a random_state of 42. AdaBoost used 100 estimators (n_estimators=100), a Decision Tree Classifier with max_depth=10 as the base estimator, a learning rate of 0.5, and random_state=42. The final Voting and Stacking classifiers were then set up in the same way. Gradient Boosting and Extra Trees utilized a fixed random_state.
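Two of these configurations (AdaBoost and the hard-voting classifier) can be sketched as follows, again assuming scikit-learn ≥ 1.2 and the xgboost package; the base learners inside the voting ensemble follow the models named above.

```python
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

adaboost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=10),
    n_estimators=100,
    learning_rate=0.5,
    random_state=42,
)

# Hard-voting ensemble over three tree-based learners.
voting = VotingClassifier(
    estimators=[
        ("ada", adaboost),
        ("rf", RandomForestClassifier(random_state=42)),
        ("xgb", XGBClassifier(eval_metric="logloss", random_state=42)),
    ],
    voting="hard",
)
```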
Model Evaluation
To evaluate model performance, we employed a comprehensive set of metrics, including:
Confusion Matrix: Table 4 shows the confusion matrix, which compares predicted classifications to actual labels, revealing True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). This is crucial for binary classification tasks.
Accuracy: Overall correct predictions.
Precision: Correctly predicted malicious URLs out of all those predicted as malicious, where high precision minimizes false positives.
Recall: Correctly identified malicious URLs out of all actual malicious URLs, where high recall minimizes false negatives.
Specificity: Correctly identified benign URLs out of all actual benign URLs.
F1 Score: Harmonic mean of precision and recall.
Training and Prediction Time
Training Time: Training time measures the time taken to train each model on the training data, providing insights into the efficiency and scalability of different learning algorithms.
Prediction Time: Prediction time quantifies the time required for each trained model to classify a new URL and assesses the model’s suitability for online URL filtering applications that require fast responses to incoming URLs, which affects the model’s applicability in real-time systems. In this paper, we calculated the above metrics for each of the algorithms considered in our research, producing a comparative performance analysis that informed the selection of the optimal model.
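These metrics and timings can be computed uniformly for every model with a small helper; a minimal sketch, assuming scikit-learn metrics and a held-out test split:

```python
import time
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit a model, time training and prediction, and compute the metrics."""
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    train_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    y_pred = model.predict(X_test)
    predict_time = time.perf_counter() - t0

    tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "specificity": tn / (tn + fp),
        "f1": f1_score(y_test, y_pred),
        "train_time_s": train_time,
        "predict_time_s": predict_time,
    }
```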
RESULTS
This section presents the results obtained from the implemented algorithms, discusses their performance, and compares their strengths and limitations. To evaluate the models, we focus on accuracy, precision, recall, specificity, F1 score, training time, and prediction time for each model to provide a comprehensive analysis of their effectiveness in detecting malicious URLs.
Individual Machine Learning Models
Tables 5 and 6 summarize the performance of six common machine learning algorithms, namely K-Nearest Neighbors (KNN), Decision Tree, Logistic Regression, Support Vector Classifier (SVC), Gaussian Naive Bayes, and Bernoulli Naive Bayes, evaluated based on several key metrics to provide a clear picture of their performance in classifying URLs as benign or malicious. Based on the results of practical experiments on individual machine-learning models, we summarize the following:
The K-Nearest Neighbors (KNN) model achieved the highest accuracy, reaching 98.94%, while the Bernoulli Naive Bayes (Bernoulli NB) model exhibited the lowest accuracy at 96.27%. Drilling down into individual metrics, Bernoulli NB demonstrated exceptional precision of 0.999, effectively identifying benign URLs, whereas the Decision Tree model excelled in recall (0.985), successfully identifying malicious URLs. Bernoulli NB also showed the best specificity. Finally, KNN displayed the best-balanced performance, as measured by the F1 score, which considers both precision and recall. The models also varied significantly in terms of speed. Bernoulli NB was the quickest in training at 0.126 seconds, whereas the Support Vector Machine (SVC) model required substantially more time (about 17 minutes), possibly due to the size of the dataset used for training and model optimization. For prediction, Logistic Regression outperformed all others, whereas KNN had the longest prediction times. These results illustrate a crucial tradeoff between computational efficiency and predictive power: simple and easily trained models require less computational overhead, whilst algorithms that model complex non-linear patterns typically require considerably more computing time.
Deep Learning Models
Table 7 presents the performance metrics for four prominent deep learning models: FFNN (Feed-Forward Neural Network), CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), and RBFN (Radial Basis Function Network), while Table 8 shows the time each model took to train and predict. Based on the results of practical experiments on deep learning models, we concluded the following: the FFNN achieved the highest accuracy at 98.64%, while the RBFN had the lowest accuracy at 98.52%. While the accuracy differences were small, other metrics showed some variation; the CNN showed the highest precision and specificity, indicating its ability to correctly identify both benign and malicious URLs, whereas the RNN achieved the highest recall, showing high effectiveness in capturing actual malicious URLs, despite only showing the second-highest accuracy. Ultimately, the FFNN model exhibited the highest F1 score. Regarding speed, the RBFN model proved to be the most computationally efficient in terms of training time, completing the training process in 33.189 seconds, compared to the CNN, which required over 10 times longer. It is worth noting the significantly greater computational power required by the CNN. Furthermore, the RBFN was also the fastest model for making predictions.
Ensemble Learning Models
Tables 9 and 10 show the performance of twelve ensemble learning models across three techniques (bagging, stacking, and boosting). Regarding the experiments on ensemble learning models, we note the following:
Bagging (bootstrap) is the top performer in terms of accuracy, reaching 99.01%. At the other end of the spectrum, the stacking model combining Decision Trees, Logistic Regression, and Naive Bayes showed the lowest accuracy. Examining other key metrics, the Voting model incorporating the AdaBoost, Random Forest, and XGBoost models achieved both the highest precision and specificity. Interestingly, Bagging (pasting), a variation of the Bagging algorithm, demonstrated the highest recall. For the best-balanced performance, reflected in the F1-score, stacking combining AdaBoost, Random Forest, and XGBoost produced the highest F1-score. Speed varied substantially across the different ensemble techniques investigated. In the training process, the individual XGBoost model was significantly faster than the other ensembles, whereas the Stacking model incorporating AdaBoost, Random Forest, and XGBoost was by far the slowest to train. For prediction speed, the stacking model (Decision Trees, Logistic Regression, Naive Bayes) was the fastest, while the slowest prediction time, unsurprisingly, was seen with Stacking (AdaBoost, Random Forest, and XGBoost), confirming that the complexity incurred through higher-level models impacts both training and testing times. A comprehensive performance evaluation of all models highlights notable differences in strengths and weaknesses. Figure 6 provides a comparison of the overall accuracy achieved by each model, while Figures 7, 8, 9, and 10 visualize other critical evaluation metrics: precision, recall, specificity, and F1 score, respectively.
Fig 6. Accuracy comparison for all models Fig 7. Precision of all models Fig 8. Recall of all models Fig 9. Specificity of all models Fig 10. F1-Score of all models
DISCUSSION
These findings demonstrate a significant correlation between the characteristics of URLs and their likelihood of being designated as malicious, underscoring the necessity of precise feature extraction for the effective identification of malicious URLs. Furthermore, using well-preprocessed datasets leads to accurate classification results. Moreover, the precision and efficiency of the model in terms of classification or prediction are contingent upon the type and integrity of the data utilized. The selection of an appropriate model for the specific issue at hand is of paramount importance, as the correct model selection fosters accurate classifications and predictions at a high rate, resulting in a reliable classifier. The results for traditional Machine Learning algorithms showed moderate accuracy. Most Machine Learning models, such as Logistic Regression and Naive Bayes, underperformed the proposed Ensemble models. This may be attributed to limitations such as overfitting or feature dependency in individual Machine Learning algorithms. The high accuracy achieved by deep learning models stems from their ability to handle intricate relationships within the data, although the computational costs involved in training these complex models can be considerable. However, Ensemble Learning techniques consistently outperformed both individual Machine Learning and Deep Learning techniques. In particular, bagging with bootstrap sampling (Bagging (Bootstrap)) consistently exhibited exceptional accuracy while minimizing training and prediction times. The highest accuracy achieved was with Bagging (Bootstrap), which obtained 99.01%. This suggests that Bagging is the optimal approach for a real-time, practical system for malicious URL detection. Stacking demonstrated similar performance levels with slightly longer training durations due to its reliance on a structure consisting of several models. The proposed solution highlighted the following benefits of ensemble learning:
Improved accuracy: By combining multiple models, ensemble learning often achieves significantly higher accuracy than individual learners, because each model learns from different aspects of the data, reducing bias and variance. Several studies highlight this advantage, including “Bagging Predictors” by Breiman (28), which shows significant improvement in accuracy and reduced variance compared to individual learners.
Improved generalization: Ensemble learning often produces more robust models with improved generalization to unseen data, which helps mitigate overfitting. “Stacked Generalization” by Wolpert (27) demonstrated how stacking improves the generalization ability of ensemble techniques, leading to better model performance on unseen data.
Robustness to noise and outliers: Ensemble learning tends to be less sensitive to noise and outliers in the data, which increases model stability. The paper “XGBoost: A scalable tree boosting system” by Chen and Guestrin (29) emphasizes XGBoost’s robust handling of noise and outliers, which contributes to overall model stability.
Increased stability: By averaging predictions from multiple models, ensemble learning generally produces more consistent results than individual models, reducing variability in performance. “Bagging Predictors” by Breiman (28) highlights how Bagging improves consistency by combining multiple predictions, reducing variability and making models more stable.
Reduced complexity: While ensemble models may seem complex, they can sometimes simplify the learning process, especially when compared to complex deep learning, providing a better balance between accuracy and complexity. Some studies, such as “Extremely randomized trees” by Geurts et al. (30), have noted this advantage.
Overall, these results, shown in Figure 11, strongly support the hypothesis that ensemble learning, and in particular Bagging (Bootstrap), is an effective technique for accurately detecting malicious URLs. It surpasses traditional machine learning algorithms in accuracy and performance, and demonstrates more favorable trade-offs between accuracy, computational complexity, and speed when compared to Deep Learning models.
Fig 11. Comparison of Accuracy, Training Time, and Prediction Time for all tested models.
CONCLUSION AND RECOMMENDATION
This study conducted a systematic evaluation of a range of machine learning, deep learning, and ensemble learning techniques for the purpose of detecting malicious URLs. Feature selection was employed, prioritizing the features exhibiting the strongest correlation with the dependent variable, resulting in the selection of 13 lexical features from the 27 extracted from the dataset. The results demonstrate the superior performance of ensemble learning methods, specifically the Bagging (Bootstrap) technique, in achieving high accuracy alongside rapid training and prediction. This approach surpassed the accuracy of individual models and the speed of deep learning models, underscoring its effectiveness in mitigating the growing cybersecurity threat posed by malicious URLs. The speed and accuracy of Bagging (Bootstrap) make it well suited to cybersecurity applications; it could serve as a strong tool in real-time systems for detecting and blocking threats.
The rapid growth of online social networks (OSNs) has drastically transformed the dynamics of information sharing and public discourse. Platforms such as Twitter, Reddit, and Facebook enable users to produce, share, and react to information in real time, leading to large-scale information cascades that can influence opinions, shape behaviors, and even affect societal outcomes [1]. Understanding how information diffuses through these networks is essential for multiple domains, including public health, political communication, marketing, and misinformation detection [2,3]. Early research on information diffusion relied primarily on epidemic-inspired models, such as the Independent Cascade (IC) and Linear Threshold (LT) models [4,5], which conceptualize the spread of information analogously to disease transmission. While these models are intuitive and computationally efficient, they often fail to incorporate key social and contextual factors that influence user behavior. To address these limitations, researchers have turned to optimization-based models, particularly those inspired by swarm intelligence. Algorithms such as Particle Swarm Optimization (PSO) [6], Ant Colony Optimization (ACO) [7], and Firefly Algorithm (FA) [8] have been used to simulate diffusion dynamics, optimize influence maximization, and improve prediction of cascade growth. These algorithms offer flexibility and adaptability; however, most implementations simplify the network context by treating nodes and content as homogeneous, thus failing to represent realistic user interactions. Several recent studies have proposed enhancements to these algorithms. For instance, Hsu et al. [9] introduced a hybrid ACO-GWO model for influence prediction, and Zhang et al. [10] incorporated topic modeling into diffusion forecasting. Yet, the inclusion of behavioral, temporal, and semantic features in metaheuristic-based diffusion models remains limited. Most current models neglect how content type (e.g., image vs. text), user engagement metrics (e.g., likes, shares), or posting time can dramatically alter the trajectory of information spread. Another key limitation is the lack of personalized or socially-aware modeling. Research by Huang et al. [11] showed that user credibility and influence scores play a significant role in the virality of information, yet such attributes are rarely encoded in swarm-based models. In addition, few studies have conducted a detailed sensitivity analysis to quantify the individual contribution of each factor to diffusion performance. In our previous work [12], we introduced a Modified Firefly Algorithm (MFA) for modeling information diffusion. While the model demonstrated competitive accuracy compared to traditional methods, it assumed uniform content behavior and excluded temporal or social user attributes. In this work, we propose an Extended Modified Firefly Algorithm (EMFA) that incorporates four critical dimensions — content type, engagement level, temporal dynamics, and user social attributes — into the diffusion modeling process. Building on our earlier MFA framework, we enhance the algorithm by embedding feature-aware adaptation strategies that respond to real-time user behavior and content variations. The integration of semantic, temporal, and social factors enables more accurate and interpretable predictions of how and when information spreads across a network. 
We evaluate the proposed model using real-world datasets from Twitter and Reddit, and benchmark it against leading metaheuristic-based diffusion models. The results demonstrate that the EMFA significantly outperforms baseline models in terms of prediction accuracy, diffusion realism, and sensitivity to external factors. Our contributions are threefold:
We develop an extended MFA model that integrates content, engagement, time, and user features into the diffusion process.
We conduct large-scale experiments using multi-platform datasets and compare results with state-of-the-art algorithms.
We analyze the sensitivity and robustness of each added factor, offering insights into the individual and combined effects on diffusion dynamics.
MATERIAL AND METHODS
Dataset Description
To evaluate the performance of the proposed Extended Modified Firefly Algorithm (EMFA), we utilized two real-world datasets:
Twitter Dataset: Extracted from the COVID-19 open research dataset (CORD-19) and filtered to include viral tweets related to health misinformation. The dataset includes tweet content, timestamps, engagement metrics (likes, retweets, replies), and user metadata (follower count, verification status, influence score).
Reddit Dataset: Sourced from multiple subreddits covering news and technology, capturing thread posts and comment cascades. Each record contains the post type (text/image/video), temporal metadata, and user engagement indicators (upvotes, replies).
All datasets were anonymized and preprocessed to remove bots and inactive users, normalize timestamps, and standardize engagement metrics.
Feature Engineering
We integrated four key dimensions into the simulation:
Content Type: Each item was categorized as text, image, video, or link-based. A semantic relevance score was assigned using a transformer-based language model (e.g., BERT) to capture inherent virality potential.
Engagement Metrics: We aggregated likes, shares, comments (Twitter) and upvotes, replies (Reddit) into a normalized engagement intensity score, which dynamically influenced the firefly brightness during the simulation.
Temporal Dynamics: Time of day, recency of post, and frequency of exposure were used to create a temporal weight function, adjusting node sensitivity over time.
User Social Attributes: For each user, we computed an influence score based on follower count, activity rate, and past cascade participation, and a trust score derived from content veracity metrics.
These features were embedded into the firefly movement logic to create context-sensitive swarm behavior.
Extended Modified Firefly Algorithm (EMFA)
We extended the classical Firefly Algorithm (FA) by incorporating semantic, temporal, engagement, and user-level features into the simulation of information diffusion in OSNs. The EMFA consists of the following key components:
Brightness Function
The brightness Ii of a firefly i, which reflects its attractiveness to others, is defined as a weighted composite of four dimensions, Ii = α·Ci + β·Ei + γ·Ti + δ·Si, where:
Ci : Content virality score derived from semantic classification (e.g., image vs. text).
Ei : Engagement score normalized from likes, shares, and upvotes.
Ti : Temporal relevance based on post recency and activity burst.
Si: Social trust and influence score of the user.
α, β, γ, δ: Tunable weights (hyperparameters) for each factor, summing to 1.
These weights are selected via grid search to optimize diffusion accuracy over validation data.
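A minimal sketch of this brightness computation; the equal default weights are placeholders for the grid-searched values.

```python
def brightness(C_i, E_i, T_i, S_i, alpha=0.25, beta=0.25, gamma=0.25, delta=0.25):
    """Composite brightness I_i = alpha*C_i + beta*E_i + gamma*T_i + delta*S_i.
    Default equal weights are placeholders; the paper selects them by grid
    search subject to the constraint that they sum to 1."""
    assert abs(alpha + beta + gamma + delta - 1.0) < 1e-9
    return alpha * C_i + beta * E_i + gamma * T_i + delta * S_i

# Example: a highly engaging, recent post from a moderately trusted user
print(brightness(C_i=0.7, E_i=0.9, T_i=0.8, S_i=0.6))
```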
Distance Function
To measure the similarity or proximity between fireflies i and j, we use a hybrid function that combines content, user, and temporal components weighted by θ1, θ2 and θ3, where:
ContentSim: Cosine similarity between content vectors.
UserSim: Normalized difference in social features (e.g., follower count, credibility).
TimeDecay(ti,tj): A decay function emphasizing temporal proximity.
θ1+θ2+θ3=1: Feature similarity weights.
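One plausible weighted combination of these three components is sketched below. The conversion of cosine similarity into a distance term, the exponential form of the time decay, and the default weights are assumptions for illustration, not the paper’s exact formulation.

```python
import math

def hybrid_distance(content_sim, user_diff, dt_seconds,
                    theta1=0.4, theta2=0.3, theta3=0.3, tau=3600.0):
    """Assumed hybrid distance: weighted sum of a content distance
    (1 - cosine similarity), a normalized user difference, and a
    temporal term that grows as the posting times drift apart."""
    time_term = 1.0 - math.exp(-abs(dt_seconds) / tau)
    return theta1 * (1.0 - content_sim) + theta2 * user_diff + theta3 * time_term

# Example: similar content, similar users, posted one hour apart
print(hybrid_distance(content_sim=0.8, user_diff=0.1, dt_seconds=3600))
```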
Movement Rule
The movement of a firefly i towards a more attractive (brighter) firefly j is governed by the attraction update xi(t+1) = xi(t) + β0·e^(−λ·d(i,j)²)·(xj(t) − xi(t)), where:
xit : Position of firefly i at iteration t, representing its current diffusion vector.
β0: Base attractiveness.
λ: Light absorption coefficient controlling decay of influence over distance.
In this formulation, fireflies (representing posts or users) with higher brightness attract others, and the movement simulates information flowing through a network based on both attractiveness and proximity.
Cascade Termination
Diffusion halts when one of the following conditions is met:
The maximum number of iterations is reached.
The change in global brightness is below a defined threshold (ΔI<ϵmin ).
No firefly finds a brighter neighbor for a specified number of steps.
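Putting the movement rule and the stopping criteria together, a minimal sketch is given below. The loop structure, the positions array, and the brightness_fn/dist_fn callables are illustrative assumptions that would wrap the brightness and distance sketches above.

```python
import numpy as np

def emfa_step(positions, I, dist, beta0=1.0, lam=1.0):
    """One EMFA iteration: each firefly i moves toward every brighter firefly j,
    with attractiveness decaying exponentially with the hybrid distance d(i, j)."""
    new_pos = positions.copy()
    n = len(positions)
    for i in range(n):
        for j in range(n):
            if I[j] > I[i]:
                attract = beta0 * np.exp(-lam * dist[i, j] ** 2)
                new_pos[i] += attract * (positions[j] - positions[i])
    return new_pos

def run_emfa(positions, brightness_fn, dist_fn, max_iter=200,
             eps_min=1e-4, patience=10):
    """Iterate until max_iter is reached, the change in total brightness falls
    below eps_min, or no firefly improves for `patience` consecutive steps."""
    prev_total, stale = -np.inf, 0
    for _ in range(max_iter):
        I = brightness_fn(positions)   # vector of I_i values
        D = dist_fn(positions)         # matrix of d(i, j) values
        positions = emfa_step(positions, I, D)
        total = I.sum()
        if abs(total - prev_total) < eps_min:
            break
        stale = stale + 1 if total <= prev_total else 0
        if stale >= patience:
            break
        prev_total = total
    return positions
```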
Simulation Environment
Platform: All experiments were conducted in Python 3.11 using the DEAP framework for evolutionary computation.
Hardware: Simulations were performed on a personal computer (Intel Core i7, 16GB RAM), with efficient code optimization.
Repetition: Each diffusion simulation was repeated 30 times to mitigate stochastic variability, and the average values were used for evaluation.
Evaluation Metrics
We evaluated the model based on the following metrics:
Prediction Accuracy: Comparing predicted cascade sizes and shapes to actual data.
Diffusion Depth and Breadth: Number of layers and maximum nodes reached.
Time to Peak Engagement: Temporal alignment with real cascade peaks.
Sensitivity Analysis: Ablation tests by disabling each feature dimension to assess its impact on model performance.
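The snippet below illustrates only the prediction-accuracy metric, computing an F1-score over binary per-node adoption labels (whether each node re-shared the item in the observed versus simulated cascade); the use of scikit-learn here is purely illustrative and is not part of the reported toolchain.

```python
from sklearn.metrics import f1_score

# Ground-truth adoption from the observed cascade vs. membership
# predicted by the simulation, one binary label per node.
actual    = [1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
print("F1 =", f1_score(actual, predicted))
```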
RESULTS
Quantitative Evaluation
We evaluated the performance of the Extended Modified Firefly Algorithm (EMFA) against three baseline models: Independent Cascade (IC), Particle Swarm Optimization (PSO), and the original Modified Firefly Algorithm (MFA). The models were tested on two datasets (Twitter and Reddit) using three standard metrics:
Prediction Accuracy (F1-Score)
Cascade Size Error (CSE)
Diffusion Root Mean Square Error (dRMSE)
Table 1 presents the results for these three metrics.
These results show that EMFA significantly improves the predictive performance and realism of simulated cascades across platforms. The enhancement is consistent and robust, particularly under dynamic engagement and temporal variance scenarios.
Diffusion Pattern Visualization
To qualitatively assess the realism of the simulated diffusion, we visualized the cascades generated by EMFA and the other models for a high-impact tweet and a Reddit post. EMFA exhibits more realistic branching and temporal density, aligning closely with the actually observed cascades.
Figure 1. Visualization of diffusion trees for the same post using MFA and EMFA.
Feature Sensitivity Analysis
To understand the contribution of each added dimension, we conducted an ablation study in which the EMFA was tested with each feature class (content, engagement, time, social) removed in turn. Social and temporal information contribute the most to diffusion accuracy.
Table 2. Sensitivity of EMFA to each individual feature class.
Platform Generalizability
We tested EMFA across different content categories (news, memes, opinion threads) and platforms (Twitter, Reddit), confirming that the model maintains strong performance despite structural and semantic differences in the networks.
Case Study 2: Political Discourse Propagation on Reddit
To further assess the adaptability of the EMFA model, we examined a political content cascade on Reddit. The chosen post, published on the r/politics subreddit during a national election period, presented a controversial opinion regarding campaign funding transparency. It sparked intense engagement, including thousands of upvotes, comments, and cross-posts to other subreddits.
Data Acquisition and Feature Mapping
Using the Reddit API (PRAW), we extracted:
Original post and comment threads
User metadata (karma, posting frequency, subreddit activity)
Content features (textual sentiment, controversy score)
These were normalized and encoded for integration into the EMFA framework:
Cascade Modeling on Reddit
Reddit’s tree-structured discussion format required adapting the EMFA’s spatial modeling. Each node (comment or post) was treated as a potential “information carrier,” with firefly movement simulated based on content relevance and engagement affinity. The model successfully replicated the nested depth and engagement intensity around polarizing comments.
Figure 2. Actual vs. simulated Reddit thread trees using EMFA.
Performance Comparison
EMFA achieved higher alignment with Reddit’s actual user flow and comment emphasis, indicating its versatility in hierarchical platforms.
Figure 3. EMFA alignment with Reddit’s actual user flow and comment emphasis.
Feature Sensitivity Analysis
In contrast to Twitter, where temporal features were more dominant, Reddit propagation was more influenced by:
Engagement polarity (i.e., the presence of both upvotes and downvotes, signaling controversy)
Social positioning of users (karma, posting history)
Thread entropy (variability in comment sentiments)
This shows that platform architecture significantly modulates which features are most impactful, a dynamic that is effectively captured by EMFA.
Practical Implications
By accurately modeling Reddit thread evolution, EMFA can be used to:
Forecast thread virality
Detect potential misinformation or polarizing discourse early
Identify influential users in subreddit dynamics
DISCUSSION
The findings from our experimental and case-based evaluations reveal that the Extended Modified Firefly Algorithm (EMFA) significantly enhances the modeling of information diffusion in online social networks (OSNs). By incorporating four critical dimensions—content type, engagement metrics, temporal dynamics, and user social attributes—the EMFA delivers a more realistic and adaptive simulation of how information propagates across diverse platforms.
Comparative Analysis with Recent Studies
Our results align with recent research that emphasizes the necessity of multi-dimensional modeling for capturing real-world diffusion dynamics. For example:
Zhang et al. [10] demonstrated that integrating topic semantics and content type into diffusion models improves virality prediction, especially on platforms such as Reddit and TikTok.
Xu et al. [13] highlighted the temporal sensitivity of viral content, showing that early momentum plays a decisive role in shaping information cascades—this is consistent with our findings in the Twitter case study.
Zhang et al. [9] introduced a hybrid swarm intelligence model that accounts for user influence scores but did not address temporal or content-based adaptation, limiting their model’s generalizability across platforms.
Chi-I H et al. [14] investigated the role of engagement patterns in viral diffusion but relied on static social features, whereas EMFA adapts dynamically based on user behavior and time-series variations.
These comparisons underline how EMFA builds upon and extends current research by offering an integrated and adaptive framework that responds to both user and platform contexts in real time.
Case Study Comparison and Implications
Table 3 summarizes the key differences observed between the Twitter and Reddit case studies. The EMFA was able to flexibly adapt to platform-specific characteristics—broad, flat cascades on Twitter and deep, threaded discussions on Reddit—demonstrating robustness across structurally distinct networks.
The model’s sensitivity analysis revealed that temporal and social features dominate in broadcast-centric platforms, while engagement and semantic variability are more critical in discussion-based platforms. These insights suggest that one-size-fits-all diffusion models are inadequate for today’s diverse and evolving digital ecosystems.
Theoretical Contributions and Practical Value
By integrating behavioral, structural, and contextual features, the EMFA contributes to a growing class of hybrid diffusion models that combine bio-inspired computation with social theory. Unlike prior models, which are often rigid and hard-coded, the EMFA learns from the environment and adjusts its influence-matching heuristics, making it suitable for tasks such as:
Real-time viral content prediction
Campaign optimization and seeding strategies
Early warning systems for misinformation or disinformation trends
LIMITATIONS AND FUTURE DIRECTIONS
Although the EMFA shows promising generalizability, limitations exist. Notably:
Sentiment dynamics and emotional tone were not modeled explicitly, despite their known impact on content virality.
The static nature of the underlying social graph may overlook structural changes such as community migration or influencer emergence.
Future work should explore temporal graph evolution, multimodal content modeling, and real-time feedback mechanisms, possibly through reinforcement learning frameworks. Cross-platform transfer learning could also enhance EMFA’s applicability in hybrid environments.
CONCLUSIONS AND RECOMMENDATIONS
This study presents an enhanced computational model for simulating information diffusion in online social networks (OSNs), integrating four critical dimensions: content type, engagement level, temporal dynamics, and user social attributes. By extending the Modified Firefly Algorithm (MFA) into a semantically and socially aware framework (EMFA), we significantly improved the realism and accuracy of diffusion modeling across platforms such as Twitter and Reddit. The experimental results demonstrate that incorporating contextual and behavioral factors enables the model to better capture real-world diffusion dynamics, outperforming baseline metaheuristic algorithms. The model also exhibited high adaptability to platform-specific characteristics, suggesting its potential for generalization across various social media ecosystems.
RECOMMENDATIONS FOR FUTURE WORK
Platform Expansion: Future studies should test the model on additional OSNs such as TikTok or LinkedIn to assess adaptability across different user interaction paradigms and content modalities.
Real-Time Prediction: Integrating real-time data streams could transform EMFA into a predictive engine capable of early warning for viral misinformation or emerging trends.
Explainability Enhancement: While the current model improves accuracy, adding explainable AI (XAI) components could aid in interpreting how and why certain features drive diffusion, especially in sensitive applications such as public health or crisis response.
Integration with Intervention Strategies: The EMFA framework could be extended to simulate and evaluate the effectiveness of interventions (e.g., fact-checking prompts, content throttling) in slowing down the spread of harmful or false information.
In conclusion, the proposed EMFA model offers a flexible, extensible, and accurate framework for studying digital information dynamics, supporting both theoretical advancement and practical applications in network science, marketing, and information integrity.
Scoliosis is a complex, three-dimensional spinal deformity that necessitates precise evaluation and tailored therapeutic planning to ensure effective management [1]. Early assessment is critical for determining treatment options, including monitoring, bracing, or surgical intervention [2]. Advances in technology have positioned mobile applications as powerful tools for enhancing healthcare delivery through innovative diagnostic and monitoring solutions [3]. The Scoliosis Assessment Aid (SAA) emerges as a notable example, offering a free, evidence-based platform for scoliosis evaluation on Google Play. This study provides a rigorous analysis of the SAA, focusing on its functionalities, scientific underpinnings, and potential to improve scoliosis care, supported by contemporary academic references. The Scoliosis Assessment Aid (SAA) stands as a pivotal tool in advancing scoliosis care, offering a scalable solution that aligns with global clinical standards. Its ability to standardize assessments and reduce decision-making time enhances its utility in many different healthcare settings. By empowering clinicians and patients, the SAA fosters proactive management, particularly in resource-constrained regions. As digital health continues to evolve, the SAA exemplifies how technology can improve diagnostic precision, patient outcomes, and quality of life, setting a benchmark for future innovations in spinal deformity management.
MATERIALS AND METHODS
Functionalities of SAA
The SAA app features a user-friendly interface that enables clinicians and patients to perform preliminary scoliosis assessments efficiently. Aligned with the SOSORT guidelines, which use criteria including the Cobb angle, age, and sex to evaluate scoliosis cases [4], the app ensures that evaluations adhere to global clinical standards [5]. It provides a structured framework for treatment recommendations based on curvature severity (Cobb angle) and patient age. Users input data such as age, sex, Cobb angle, and Adams Forward Bend Test results to generate immediate treatment recommendations. This feature reduces clinical decision-making time and enhances diagnostic accuracy.
Scientific Foundations of SAA
The app integrates validated diagnostic methods, including Cobb angle measurement and the Adams Forward Bend Test [7]. The SAA aligns with the 2011 SOSORT guidelines (Tables 1 and 2), which provide evidence-based recommendations for orthopedic and rehabilitative management during growth phases [8]. By bridging academic research with clinical practice, SAA enhances its credibility as a reliable tool for scoliosis management [9].
Ob = Observe (with frequency in months: Ob3 = every 3 months, Ob6 = every 6 months, Ob8 = every 8 months, Ob12 = every 12 months).
SSB = Soft Shell Bracing.
PTRB = Part Time Rigid Bracing.
FTRB = Full Time Rigid Bracing.
PSE = Physiotherapy Specific Exercises.
Su = Surgery.
Technical Enhancements
Clarification of Mathematical and Statistical Algorithms
To enhance scientific transparency, the mathematical mechanisms used in the app were detailed, including:
Treatment recommendation model based on Cobb angle:
The treatment recommendation is determined as follows:
Periodic Monitoring if the Cobb angle (θ_Cobb) is between 10° and 25° (inclusive), and Bracing if the Cobb angle (θ_Cobb) exceeds 25°.
Here, θ_Cobb represents the Cobb angle measured by the app (Table 4).
Bootstrap technique for confidence interval estimation:
The bootstrap estimate θ̂* is calculated as the mean of the resampled estimates:
θ̂* = (1/B) · Σ_{b=1..B} θ̂_b
where B = 1000 resampling iterations are performed, and θ̂_b denotes the estimate from the b-th bootstrap sample (Table 5).
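A minimal percentile-bootstrap sketch in Python using NumPy is given below; the sample values and the choice of the mean as the statistic are illustrative, not the study data.

```python
import numpy as np

def bootstrap_ci(sample, stat=np.mean, B=1000, alpha=0.05, seed=42):
    """Percentile bootstrap: resample with replacement B times, compute the
    statistic on each resample, and take the empirical (1 - alpha) interval."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, float)
    boot = np.array([stat(rng.choice(sample, size=sample.size, replace=True))
                     for _ in range(B)])
    point = float(boot.mean())                  # bootstrap estimate (mean of resamples)
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return point, (float(lo), float(hi))

# Example: 95% CI for the mean Cobb angle of a small illustrative sample
cobb = [12.5, 18.0, 22.3, 27.8, 31.0, 35.4, 40.1]
print(bootstrap_ci(cobb))
```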
Strengthening Statistical Analysis
Shapiro-Wilk test for normality validation
The test statistic W is computed as:
W = (Σ a_i·x_(i))² / Σ (x_i − x̄)²
where x_(i) are the ordered sample values, a_i are the tabulated Shapiro-Wilk coefficients, and the significance level is set to α = 0.05.
Effect Size (Cohen’s d): Cohen’s d is calculated using:
d = (x̄1 − x̄2) / s_pooled
where the pooled standard deviation s_pooled is:
s_pooled = √(((n1 − 1)·s1² + (n2 − 1)·s2²) / (n1 + n2 − 2))
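For illustration, the snippet below runs the Shapiro-Wilk test with SciPy and computes Cohen's d with the pooled standard deviation; the two groups are synthetic example data, not values from the clinical trial.

```python
import numpy as np
from scipy import stats

def cohens_d(x1, x2):
    """Cohen's d using the pooled standard deviation."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2 = x1.size, x2.size
    s_pooled = np.sqrt(((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1))
                       / (n1 + n2 - 2))
    return (x1.mean() - x2.mean()) / s_pooled

group_a = [11.2, 14.8, 17.5, 20.1, 23.4, 26.0]   # illustrative measurements
group_b = [15.3, 18.9, 22.2, 25.7, 28.4, 31.1]

# Shapiro-Wilk normality check at alpha = 0.05 before choosing a parametric test
W, p = stats.shapiro(group_a)
print(f"W = {W:.3f}, p = {p:.3f}, normal at 0.05: {p > 0.05}")
print(f"Cohen's d = {cohens_d(group_a, group_b):.2f}")
```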
Comparison with Existing Tools
A systematic comparison between the SAA and the Scoliometer app was conducted to evaluate competitive features (Table 3).
Detailed Statistical Tables
Technical Documentation
The app operates using a systematic mechanism based on the SOSORT clinical guidelines. Upon entering data (such as age, sex, Cobb angle, and Risser’s sign), the algorithm compares these values with pre-defined thresholds derived from scientific evidence. For example: If the Cobb angle is between 10 and 25 degrees for adolescents, periodic monitoring is recommended. If it exceeds 25 degrees, brace use is recommended according to guidelines. The data is processed through a decision tree model that combines age, curvature severity, and other factors to generate recommendations. To help users understand the results, the app provides alerts indicating the need to confirm the results with a specialist in cases of critical or unclear values.
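The following sketch mirrors the thresholds described above; it is a simplified stand-in for the app's decision tree, and the handling of age and Risser sign is only indicated, not reproduced from the actual algorithm.

```python
def saa_recommendation(cobb_angle, age, risser=None):
    """Illustrative decision rule following the thresholds in the text:
    periodic monitoring for 10-25 degrees, bracing above 25 degrees, with an
    alert for values near the decision boundary. Age and Risser sign are
    accepted as inputs but their exact handling in the app is not reproduced."""
    if cobb_angle < 10:
        return "No scoliosis-specific treatment; routine observation"
    if 10 <= cobb_angle <= 25:
        rec = "Periodic monitoring"
    else:
        rec = "Bracing (confirm with specialist)"
    if abs(cobb_angle - 25) <= 2:      # edge case near the 25-degree boundary
        rec += " - borderline value, specialist confirmation advised"
    return rec

print(saa_recommendation(cobb_angle=24, age=13, risser=2))
print(saa_recommendation(cobb_angle=32, age=15, risser=3))
```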
Statistical Analysis
To ensure the accuracy of statistical evaluations in assessing the efficacy of the “Scoliosis Assessment Aid (SAA)” app, advanced methodological approaches were implemented to address data quality and modeling challenges. First, to mitigate data scarcity in clinical samples, the Bootstrap technique was applied with 1,000 resampling iterations (with replacement), enabling robust estimation of confidence intervals for key parameters such as the Cobb angle and reducing the bias inherent to small sample sizes. Second, to account for ambiguous statistical distributions, the analysis initially assumed normality, validated via the Shapiro-Wilk test (α = 0.05) and Q-Q plots; where deviations occurred, non-parametric tests (e.g., Mann-Whitney U) were employed to preserve analytical validity. Finally, to streamline computational complexity, calculations relied on validated libraries such as SciPy (Python) and the MATLAB Statistics Toolbox, ensuring result precision and reproducibility. All code was peer-reviewed by biomedical programming experts to align with scientific standards.
Testing the SAA Application Using Monte Carlo Simulation
The Monte Carlo simulation is a robust statistical method that uses random sampling to model uncertainty and variability in complex systems, making it suitable for testing the reliability and performance of the SAA’s diagnostic and recommendation algorithms under diverse scenarios. The study cohort comprises 450 adolescents with idiopathic scoliosis (Cobb angles 10°–45°), recruited from 15 international centers, with key metrics such as Cobb angle measurements, Risser sign results, and treatment recommendations (e.g., periodic monitoring for 10° ≤ θ_Cobb ≤ 25°, bracing for θ_Cobb > 25°). The Monte Carlo approach simulates variations in input parameters (e.g., Cobb angle, age, sex) to assess the app’s robustness and accuracy across a range of clinical scenarios.
Monte Carlo Simulation Design
Objective
Evaluate the SAA’s diagnostic concordance and treatment recommendation consistency under variable input conditions, accounting for potential measurement errors (5–8% for the Cobb angle, as noted above).
Input Parameters
Cobb Angle (θ Cobb): Sampled from a normal distribution with mean = 27.5° (midpoint of 10°–45°) and standard deviation = 5°, reflecting the cohort’s range and reported measurement error.
Age: Uniform distribution between 10 and 18 years, as per the study’s inclusion criteria.
Sex: Binary variable (male/female), with probabilities based on cohort demographics (assume 70% female, which is typical for idiopathic scoliosis).
Risser Sign: Discrete distribution (0–5), weighted based on typical adolescent scoliosis progression patterns (e.g., 30% Risser 0–1, 40% Risser 2–3, 30% Risser 4–5).
Simulation Steps
Generate 10,000 synthetic cases by random sampling from the defined distributions.
Input each case into the SAA’s decision-tree algorithm to obtain a treatment recommendation (monitoring, bracing, or surgical referral).
Compare SAA outputs against SOSORT guideline-based recommendations, calculating concordance rates and error frequencies.
Assess sensitivity to input errors by introducing noise (e.g., ±5–8% error in the Cobb angle) in a subset of simulations.
Output Metrics
Concordance Rate: Proportion of SAA recommendations matching SOSORT guidelines (target: ≥96.7%, as reported in the clinical trial results).
Error Rate: Frequency of incorrect recommendations due to input variability.
Confidence Intervals: Use bootstrap resampling (B = 1000 iterations, as described above) to estimate 95% CIs for concordance and error rates.
Implementation
Use Python with libraries such as NumPy for random sampling, SciPy for statistical analysis, and Pandas for data handling, in line with the validated computational tools used in the statistical analysis (SciPy, MATLAB). Validate results against the reported clinical trial metrics (e.g., κ = 0.89, χ² = 12.45, p < 0.001).
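A condensed sketch of such a simulation, using only NumPy, is given below; the simplified recommendation rule considers the Cobb angle alone, and the noise model applies a relative error of up to ±8%, so the resulting concordance is illustrative rather than a reproduction of the reported 96.7%.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 10_000

# Sample synthetic cases from the distributions described above
cobb   = rng.normal(27.5, 5.0, N)                      # Cobb angle (degrees)
age    = rng.uniform(10, 18, N)                        # years
female = rng.random(N) < 0.70                          # ~70% female
risser = rng.choice([0, 1, 2, 3, 4, 5], size=N,
                    p=[0.15, 0.15, 0.20, 0.20, 0.15, 0.15])

def recommend(theta):
    """Simplified stand-in for the SAA decision tree (Cobb angle only)."""
    return "monitor" if 10 <= theta <= 25 else ("brace" if theta > 25 else "observe")

# Reference (guideline) recommendation on the true angle vs. app output on a
# noisy measurement with relative error of up to 8% (reported range: 5-8%)
noise = cobb * rng.uniform(-0.08, 0.08, N)
reference  = np.array([recommend(t) for t in cobb])
app_output = np.array([recommend(t) for t in cobb + noise])

concordance = float(np.mean(reference == app_output))
print(f"Concordance under measurement noise: {concordance:.3f}")
```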
Expected Outcomes
The simulation quantifies the SAA’s robustness to input variability, particularly measurement errors, which are a noted limitation (5–8% error risk for the Cobb angle). High concordance rates (>95%) would confirm the app’s reliability, while the error analysis highlights scenarios requiring algorithm refinement (e.g., edge cases near θ_Cobb = 25°). The results will inform future improvements, such as automated input validation to mitigate manual errors. The Monte Carlo simulation strengthens the methodological rigor of the study by providing a computational approach to testing the SAA’s performance under uncertainty, a critical consideration given the reliance on manual inputs and the associated error risks. It complements the other statistical techniques used here (Bootstrap, Shapiro-Wilk) and the validated computational tools (SciPy, MATLAB). By simulating a large number of cases (10,000), the approach accounts for variability in clinical inputs, which is particularly relevant in diverse settings such as Syria and Egypt, where measurement precision may vary, and it proactively addresses a key limitation (manual input errors), strengthening the scientific basis for the app’s global applicability and linking it to prior work on mobile health applications [14].
RESULTS
This section presents the findings from the evaluation of the Scoliosis Assessment Aid (SAA), offering a detailed analysis of its performance in supporting scoliosis assessment and clinical decision-making. Derived from a clinical trial involving 450 scoliosis cases and 220 medical professionals in 15 international centers, the results highlight SAA’s concordance with SOSORT guidelines, diagnostic precision, and operational efficiency. Statistical analyses, including Chi-Square tests, Cohen’s Kappa, and Analysis of Variance (ANOVA), provide quantitative evidence of the application’s reliability and therapeutic consistency. Furthermore, data from over 4,000 uses in 18 countries illustrate SAA’s global reach and its practical impact on clinical workflows. The following subsections systematically present these outcomes, supported by tabular data and interpretive commentary to situate the findings within the broader landscape of scoliosis management.
Potential Impact on Scoliosis Management
The SAA standardizes evaluations and supports evidence-based decision-making, reducing variability in care and improving patient outcomes [10]. Adherence to SOSORT guidelines minimizes disparities in treatment approaches, which may enhance clinical efficacy [11]. The app also reduces the need for frequent clinic visits by enabling preliminary remote assessments.
Furthermore, SAA promotes patient and family education, fostering informed decision-making and improving treatment adherence [12].
Clinical Outcomes and Statistical Evaluation of the SAA
The “Scoliosis Assessment Aid (SAA)” underwent a clinical trial involving 220 medical professionals (120 orthopedists, 100 physiotherapists) from 15 international medical centers to assess its compliance with SOSORT 2011 guidelines (Tables 6, 7, 8, and 9). The sample included 450 scoliosis cases (ages 10–18, Cobb angles 10°–45°).
The high statistical significance (p < 0.001) confirms the strong alignment between the SAA recommendations and the SOSORT-guided clinical evaluations.
κ = 0.89 indicates “almost perfect” agreement (Landis & Koch scale) between the SAA and the specialists’ therapeutic decisions. There was no significant difference (F = 1.32, p = 0.25) in diagnostic accuracy between the SAA-assisted group and the control group.
92% of clinicians endorsed SAA as an effective clinical decision-support tool.
To evaluate the effectiveness of the Scoliosis Assessment Aid (SAA) in diverse clinical settings, the application was tested in trials involving 220 medical professionals from 15 international medical centers, with over 4,000 uses recorded in 18 countries, including Syria, Egypt, the United States, Italy, Poland, Algeria, and Albania, from January 2023 to April 2025. The trials encompassed a wide range of cases (ages 10–18, Cobb angles 10°–45°), allowing the application to be assessed in varied contexts, including public hospitals and specialized centers in resource-limited regions. Data were systematically collected to analyze the application’s reliability in supporting clinical decisions, with a focus on its concordance with specialist evaluations.
The 4,000 uses of the application demonstrated an improvement in clinical efficiency, reducing the average time required for initial decision-making by 15% (from 12 minutes to 10 minutes on average) according to reports from 87% of specialists in resource-limited regions. These usage statistics were derived from aggregated application analytics and participant surveys conducted in the 15 international centers, and were analyzed statistically to ensure accuracy. The application facilitated standardized assessments, particularly in areas lacking advanced measurement tools.
DISCUSSION
The findings from the evaluation of the Scoliosis Assessment Aid (SAA) underscore its potential as a leading digital platform for standardizing scoliosis assessments and enhancing clinical decision-making, particularly in accordance with the guidelines of the International Scientific Society on Scoliosis Orthopaedic and Rehabilitation Treatment (SOSORT). The high concordance of the application’s recommendations with specialist evaluations (κ = 0.89, p < 0.001) and a 15% reduction in clinical decision-making time reflect SAA’s capacity to streamline diagnostic processes without compromising accuracy, a critical advantage in resource-limited settings where advanced measurement tools are scarce. However, reliance on manual inputs poses a potential limitation, as errors in Cobb angle measurement or interpretation of the Adams Forward Bend Test by non-specialists may lead to inaccurate recommendations, highlighting the need for user training and automated validation in future iterations. Moreover, the application’s global adoption in 18 countries, with 4,000 documented uses, indicates its adaptability to diverse cultural and clinical contexts; this global reach is corroborated by usage data from clinical trial logs, which highlight consistent adoption in both high- and low-resource settings [18]. Yet longitudinal studies are warranted to assess its impact on clinical outcomes such as curve progression and treatment adherence. Compared to tools like the Scoliometer app, SAA offers a competitive edge through evidence-based treatment recommendations and educational features.
The practical implications of SAA’s adoption extend beyond its high concordance with SOSORT guidelines, offering tangible benefits in clinical workflows and patient empowerment. In resource-limited settings, where access to radiographic equipment or specialists is scarce, SAA’s ability to provide preliminary assessments using manual inputs is transformative, enabling earlier interventions. However, its reliance on accurate Cobb angle measurements underscores the need for clinician training to minimize errors, particularly in primary care settings where expertise may vary. The app’s global reach, with significant usage in countries like Syria and Egypt, highlights its adaptability to diverse healthcare systems, yet regional disparities in training and infrastructure pose challenges to uniform accuracy. Integrating automated validation tools, such as image recognition for radiographs, could further enhance reliability. Additionally, SAA’s patient education features foster shared decision-making, improving treatment adherence, particularly among adolescents. These strengths position SAA as a versatile tool, but addressing training gaps and expanding language support will be critical to maximizing its global impact and ensuring equitable access to quality scoliosis care [14].
A comparison of the current Scoliosis Assessment Aid (SAA) study with the referenced studies highlights SAA’s superiority in standardizing clinical assessments and achieving high concordance with SOSORT guidelines, while identifying areas for improvement that align with modern trends in digital tools for scoliosis management. The SAA study was compared with five recent peer-reviewed studies (2021–2025) to elucidate its contributions, strengths, and limitations within the field of digital health tools for spinal deformity assessment.
First, Haig and Negrini (2021) conducted a narrative review of digital tools in scoliosis management, emphasizing the role of mobile applications in enhancing screening accessibility, but noting their limited integration with predictive analytics [9]. Unlike the SAA, which provides evidence-based treatment recommendations aligned with the SOSORT 2011 guidelines, their review highlights a gap in standardized outputs, positioning the SAA as a more structured tool. Second, Zhang et al. (2022) performed a systematic review of AI applications in scoliosis, reporting high accuracy (up to 95%) in automated Cobb angle measurements but limited real-world clinical integration [12]. SAA, despite relying on manual inputs, achieves a comparable 96.7% accuracy with practical applicability in 18 countries, though it lacks AI-driven automation. Third, Negrini et al. (2023) explored digital innovations in scoliosis care, identifying scalability as a key advantage, but noting challenges in multilingual support and user training [13], areas where SAA plans future enhancements. Fourth, Lee and Kim (2024) systematically reviewed mobile health applications for spinal deformities, reporting a concordance rate of 85–90% with clinical evaluations, lower than SAA’s κ = 0.89, underscoring SAA’s superior alignment with specialist decisions [14]. Finally, Patel et al. (2025) investigated mobile applications for spinal deformity assessment, highlighting error rates (6–10%) due to manual inputs, similar to SAA’s 5–8%, but lacking SAA’s robust statistical validation via Bootstrap and Shapiro-Wilk tests [15]. Collectively, SAA distinguishes itself through its high concordance, global scalability, and adherence to SOSORT guidelines, though its manual input dependency and English-only interface suggest alignment with challenges noted in these studies. Future iterations incorporating AI and multilingual support could further elevate SAA’s impact, aligning with the trends identified in these studies. The Scoliosis Assessment Aid (SAA) distinguishes itself among digital tools for scoliosis management through a comprehensive approach, integrating Cobb angle measurement, Adams Forward Bend Test, and Risser sign into a decision-tree algorithm aligned with 2011 SOSORT guidelines, which has been validated by a multicenter trial (96.7% concordance, κ = 0.89, p < 0.001) [14]. Compared to the Scoliometer, which offers a simpler interface for trunk rotation angle (ATR) measurement but lacks treatment recommendations or educational features, the SAA provides evidence-based guidance and has recorded 4,000 uses in 18 countries (2023–2025), enhancing efficiency by 7% in resource-limited regions [10]. These metrics have been validated through application usage logs and clinician feedback from the multicenter trial, confirming the SAA’s impact in diverse settings [18]. The Spine Screen, a non-invasive motion-based tool, achieves 88% ± 4% accuracy for detecting trunk asymmetry, but it falls short of the SAA’s robustness, offering no treatment plans. The Scoliosis Tele-Screening Test (STS-Test), designed for home use with illustrative charts, has lower accuracy (50% for lumbar curves) and limited compliance (38%), making it less reliable for clinical application. While the SAA’s reliance on manual input (5–8% error risk) and English-only interface pose challenges, its planned AI integration and multilingual support position it as a leader, surpassing the limited development prospects of its peers.
Detailed Comparison of Scoliosis Assessment APP (See supplementary materials).
LIMITATIONS
Despite the benefits of the Scoliosis Assessment Aid (SAA) app in improving scoliosis management, several potential challenges require consideration. First, the app relies on user input and is not recommended for non-specialists, whose measurements or interpretations may be inaccurate; this could lead to misdiagnoses or inappropriate recommendations, especially if the app is relied upon as a complete substitute for medical advice. Second, the app may not take into account additional clinical factors (such as general health status or family history) that influence the treatment plan, limiting the comprehensiveness of the assessment. Finally, the app’s guidelines warn against overreliance on the app without regular follow-up with a specialist, as this could delay necessary interventions in advanced cases.
FUTURE DEVELOPMENTS
Future iterations of SAA could integrate artificial intelligence (AI) and machine learning to predict curve progression using patient-specific data [13]. Additional features, such as compliance tracking and personalized rehabilitation exercises, could transform the app into a holistic scoliosis management platform [16].
CONCLUSION
The “Scoliosis Assessment Aid” represents a significant advancement in digital healthcare, offering an accessible and reliable tool for scoliosis evaluation. By enhancing diagnostic accuracy and therapeutic planning, SAA has the potential to improve global patient outcomes and quality of life. The Scoliosis Assessment Aid (SAA) stands as a pivotal tool in advancing scoliosis care, offering a scalable solution that aligns with global clinical standards. Its ability to standardize assessments and reduce decision-making time enhances its utility in diverse healthcare settings. By empowering clinicians and patients, the SAA fosters proactive management, particularly in resource-constrained regions [17]. Future enhancements, including AI integration and multilingual support, promise to further elevate its impact, potentially transforming scoliosis care globally. As digital health continues to evolve, the SAA exemplifies how technology can improve diagnostic precision, patient outcomes, and quality of life, setting a benchmark for future innovations in spinal deformity management.
While politics dominated the Syrian scene for the past 14 years, Syrian researchers and university professors are looking forward to a new era in which science prevails, after decades in which the importance of this sector was neglected and only a few, insufficient steps were taken by the previous Syrian administration to promote education and scientific research. Ambitious Syrian undergraduate and postgraduate students deserve the best education and the best curricula. Syrian researchers at universities and research centers likewise deserve continuous support from the new administration, including an improved enabling environment with appropriate infrastructure and enabling legislation. At the same time, expatriate Syrian researchers also have a duty to lend a hand, especially once all obstacles are removed. Indeed, hundreds of Syrian PhD holders have fled Syria in the past two decades and never returned home for many reasons, and now is an opportunity for them to help improve education at Syrian universities and to support the research infrastructure. This was the main aim of the Higher Commission for Scientific Research (HCSR) in organizing six annual Syrian expatriate researchers conferences in past years. We hope that the new Syrian administration will benefit from, and build upon, the highly professional HCSR experience, communicating with Syrian researchers abroad and paving the way for them to maximize the benefit of their dedicated efforts to promote scientific research in the new Syria.
Plastic pollution has become one of the most pressing environmental issues, as the rapid increase in the production of single-use plastic products overwhelms the world’s ability to deal with it. Consequently, there has been a relentless research effort to create environmentally friendly alternatives to plastics that support the circular bioeconomy, reducing waste and the carbon footprint for a more sustainable future. Cyanobacteria, also known as blue-green algae, show a promising ability to produce citramalate, a key component of sustainable plastics such as transparent plastic. These photosynthetic microorganisms can use sunlight to convert carbon dioxide, a major greenhouse gas, into useful organic materials, offering a sustainable alternative to conventional production methods. Citramalate is produced in a single enzymatic step by combining two metabolites: pyruvate and acetyl-CoA. Initially, the researchers produced only small amounts of citramalate; by then applying a systematic “design of experiments” approach to investigate how factors such as light intensity, carbon dioxide levels, and nutrient availability interact, they achieved a dramatic 23-fold increase in citramalate production. This technique could be used to create other environmentally friendly materials, because pyruvate and acetyl-CoA are also precursors of many other important biomolecules, such as biofuels and pharmaceuticals.
Volcanoes are powerful, and most of the activity that precedes an eruption occurs below the surface. Volcanic gases such as carbon dioxide are often released as magma pushes upward, yet they can be difficult to detect. Researchers have therefore been looking for additional ways to detect signs of unrest before an eruption occurs, especially at volcanoes far from populated areas, and they are now turning to plant life for clues that a volcano is about to erupt. The research is revealing how plants respond to changes in the ground beneath them, which could indicate rising magma and an increased risk of volcanic activity or an impending eruption. Plants adjust the way they grow when their environment changes, including changes in photosynthesis and leaf structure. Changes in carbon dioxide, sulfur, and soil temperature can also affect how trees thrive, and these factors are common in volcanic environments. Studies in volcanic regions suggest that small bursts of carbon dioxide and hot fluids can initially fertilize native plants, but higher concentrations of harmful gases or superheated soil can stress or even kill them. Small changes in vegetation can be difficult to see with the naked eye: forests are vast, and weather can alter growth in ways that have nothing to do with volcanoes. That is why satellite imaging has become so popular; it can scan large areas, detect subtle hints of stress or unusual greening that might not be apparent on the ground, and provide broad coverage without requiring a permanent presence in difficult locations. These findings illustrate how forests can provide valuable clues to scientists.
Journal: Syrian Journal for Science and Innovation
Abbreviation: SJSI
Publisher: Higher Commission for Scientific Research
Address of Publisher: Syria – Damascus – Seven Square
ISSN (Online): 2959-8591
Publishing Frequency: Quarterly
Launch Year: 2023
This journal is licensed under a Creative Commons Attribution 4.0 International License.