Sickle Cell Anemia Project (SCA)

This project aims to develop an environment for analyzing data on Sickle Cell Anemia (SCA). The environment is intended to semi-automatically extract information from a set of papers about this disease and to apply data mining algorithms that identify patterns indicating interesting, so far unknown relationships and/or predict future events based on past data. Such relationships can be, for instance, cause-effect relationships for this disease.

The environment is composed of two main systems: DORS-SCA (Data Organizing and Recovering System for Sickle Cell Anemia) and DSS-SCA (Decision Support System for Sickle Cell Anemia). The first aims to extract relevant information from scientific articles written in English about sickle cell anemia and to store it in a database. The second aims to identify patterns and/or to support the prediction of future events by applying data warehouse and data mining techniques.



The Sickle Cell Anemia Project team, which includes professors, researchers, and PhD, MSc and undergraduate students, generally meets once a week. The scheduled meetings are listed under "Meeting".

Overview of Sickle Cell Anemia

Sickle Cell Anemia (SCA) is an inherited (i.e., genetic) and hematological (i.e., blood-related) disease that causes chronic destruction of red blood cells, episodes of intense pain, susceptibility to infections and, in some cases, premature death. It mainly affects people of African descent. Because the genes are inherited from the parents, the disease is not contagious. Unlike common anemia, which can be cured with food that contains iron, vitamin B12 or vitamin C, SCA has no cure and cannot be alleviated through diet. However, it is a treatable disease, and patients can participate in the labor market provided that they receive adequate medical treatment and have responsibilities consistent with their limitations and potential.

SCA emerged in countries of central-west Africa, India and East Asia about 50,000 to 100,000 years ago, between the Paleolithic and Mesolithic periods. Paradoxically, it emerged as a form of self-protection of the human body against malaria, which is common in regions with a warm climate. The disease, which is passed from parents to children, spread all over the world as a result of migration, colonization and, especially, racial miscegenation (Figure 1). It is more frequent among people whose ancestors came from Africa, the Mediterranean countries (such as Greece, Turkey and Italy), the Arabian Peninsula, India, and the regions of Spanish colonization in South America, Central America and parts of the Caribbean.

Figure 1 – Racial miscegenation.

Sickle Cell Disease (SCD) affects millions of people worldwide. SCD is the most common hereditary blood disease in the United States, affecting 70,000 to 80,000 North Americans. Its estimated frequency in the United States is 1 in every 500 North Americans of African descent and 1 in every 1,000 to 1,400 North Americans of Hispanic descent.

In Brazil, SCA is also the most common inherited disease, and it is considered a public health problem. SCA was introduced into the country by the slave trade, which started in 1550. The slaves first worked in the sugar cane industry in the Northeast and later in gold mining and precious metal extraction in Minas Gerais. Not surprisingly, Bahia is today the state with the highest concentration of the disease in Brazil.

Data from the National Newborn Screening Program (Programa Nacional de Triagem Neonatal, PNTN) show that every year 3,500 children are born with Sickle Cell Disease and 200,000 with sickle cell trait. In general, twenty percent of these children will not reach the age of five due to complications directly related to the disease. The infant mortality rate among children without any treatment is 25%. Data from the Rio de Janeiro Blood Center show a reduction of around 2.4% when the patient receives comprehensive care during treatment. Proper treatment therefore plays a fundamental role in reducing morbidity and mortality among these patients.

Although SCA is most common among people of African descent, any person may have the disease or carry the trait simply as a result of racial miscegenation. According to PNTN, the highest incidence of either SCA or sickle cell trait is found in the states of Bahia, Rio de Janeiro and Minas Gerais (Figure 2). In Figure 2, the S gene frequency indicates the incidence of the disease or sickle cell trait.

Figure 2 - The S gene frequency in different areas of Brazil.


For more information about Sickle Cell Anemia, please see the Technical Report "Sickle Cell Anemia".