Syntagmatic Modelling: Proposal for Indexical Representation of Evolutions of Notes in Patient’s Records

The aim of this paper is to propose a model of indexical representation that uses the syntagms referring to the names of diseases, symptoms and signs recorded in the evolution of notes in patient’s records, in order to organize and provide access to the information in these documents with greater precision. Qualitative research based on Content Analysis (CA) was carried out at the Walter Cantidio University Hospital, Federal University of Ceará, Fortaleza, Brazil. The corpora consist of twenty evolutions from two Patient’s Records, totaling eight volumes. The evolutions were read, the syntagms referring to diseases, symptoms and signs were selected, and the index was organized with the mapped syntagms. Findings: 1200 syntagms referring to the names of diseases, symptoms and signs are presented according to the indexes. These syntagms accurately state the particularity of the information recorded in the evolution of notes in patient’s records and may contribute to access and retrieval of information with better precision in the Medical and Statistics Archives Services (MSAS). This paper shows that the use of syntagms makes it possible to offer access and retrieval of information with higher added value in the context of the evolution of notes in patient’s records, since they express the substance of what they name. We consider that a structured information retrieval system is highly relevant for the MSAS. The syntagms contribute to understanding the planning of patient care actions and are a tool for researchers evaluating the context of the diseases treated in health organizations.
We infer that, just as in any other domain of knowledge, such modeling in the context of this research should also observe the stages of definition of the lexicon and base symbols, definition of a grammar to be adopted, and definition of a semantics, which together contribute to the construction of indexes, in this case constituted by the syntagms. This represents a paradigm shift in relation to the use of uniterms or concepts composed of a maximum of three lexical units for the indexical representation of health documentation, such as Patient’s Records. Finally, the data from this research can be reused in other qualitative studies, allowing other perceptions regarding the use of syntagms for indexical representation in the MSAS, and can contribute to more specific evidence-based medicine studies and to other research themes: patient speech, therapy and exams.


Introduction
The indexical representation, indexing or thematic representation of information (manual, semi-automatic or automatic) of traditional scientific, technological and literary documents is a complex task that uses words or concepts extracted directly from these documents in Natural Language (NL) or terms (descriptors) captured from Documentary Languages (DL) and/or assigned by human or "machinic" indexers. In these schemes, uniterms or words usually composed of two or, at most, three lexical units are adopted. This way of establishing such representation brings innumerable problems for information retrieval (IR), because words have many ambiguities, and even those composed of two lexical units can also be ambiguous or polysemous and, consequently, interfere with IR.
For example, in healthcare the word "pressure" can be understood in various contexts: ophthalmology, cardiology, psychiatry, etc. Therefore, in each of these specialties it is necessary to adopt predication, so that there is no interference in the communication process within the multidisciplinary health team and between the health team and patients. Thus, in ophthalmology it would be "eye pressure" or "ocular pressure"; in cardiology it refers to "blood pressure"; in psychiatry it means "social pressure" or "psychological pressure". Another example is the term "kidney disease", whose meaning is too general to specify the particularities of this disease. Thus, for the physician to individualize pathological conditions, it is necessary to specify which denomination is most adequate: "Chronic Kidney Disease", "Acute Kidney Failure", "Acute Kidney Failure with Tubular Necrosis", "Acute Kidney Failure with Medullary Necrosis". These examples ratify the so-called complexity of the one evoked by Deleuze and Guattari [1]: "[…] The one is said in one and the same sense of the whole multiple, the Being is said in one and the same sense of everything that differs (...)." In reality, "We are not speaking here of the unity of substance, but of the infinity of the modifications that are part of each other on this one and the same plane of life". Therefore, when thinking about indexical representation within the information systems framework for the Medical and Statistical Archives Services (MSAS), whose collection consists of the Patient's Records, all these aspects must be taken into account.
Because of all these characteristics, the indexical representation of these documents must be increasingly understood from a complex and multidisciplinary perspective, especially in dialogue with the philosophy of language, terminology and linguistics. Reports from the Text REtrieval Conference [2] argue that indexing with word groups enables better information retrieval (IR). Similarly, indexing by complex terms or syntagms favors the identification of semantic entities and, consequently, a better representation of the themes dealt with in the documents [3]. It is in this context that syntagms are adopted for the indexical representation of various types of documents, aiming to offer a better quality of information retrieval, as shown by the research developed by Widdows et al. [4] and Mitra et al. [5], among others. In the context of indexical representation of patient records, the literature review by Hersh and Hoyt [6] shows that there is a lack of studies concerning this documentation, a fact that was also confirmed in the present study.
It is, therefore, from this perspective that the research was developed from the following problem: how to apply the syntagmatic modeling in the context of indexical representation in the evolution of notes in the patient's record?
According to Paquette [7], to structure the representational modeling, it is necessary to observe the following steps: a) Definition of the lexicon and the base symbols that will be used in the representation. Example: Abd: Ascitic Globose; RHA +; b) Definition of a grammar that will describe the set of accepted syntagms by combining them with the symbols. Example: Esaf: ppp. Edema (4 + / 4 +) in the upper limbs; c) Definition of semantics that will associate a mental and intelligible representation by coupling one or more expressions of the grammar of knowledge as part of the mental model of the person who expresses himself through language. Example: "If fever is maintained and no infection point of departure is identified, an indication to remove cardiovascular catheter (CVC) and to request bacteriological identification is placed".
Thus, the purpose of the research is to propose a model of indexical representation, using the syntagms adopted to record the names of diseases, symptoms and signs in the evolution notes in Patient's Records, as a new paradigm for the representation, organization and retrieval of information from these documents with greater precision.
Resolution No. 1,638/2002 [8] of the Brazilian Federal Council of Medicine (CFM), in its Article 1, defines the patient's medical record as "a single document consisting of a set of information, signs and images recorded, generated from facts, events and situations regarding the patient's health and the care provided, of legal, confidential and scientific nature [...]". Item I, point c, of Article 5 states that the evolution is a document containing "date and time, a breakdown of all the procedures to which the patient is submitted and the identification of the professionals who performed. It should be electronically signed when prepared and/or electronically stored" [8]. Analyzing Patient's Records, Bentes Pinto [9] considers that, like any other document, they have physical and logical structures. First, there are the macrostructures or descriptive features (à la Gardin [10]), which present the global meaning of the constitutive data of the Patient's Records: patient identification; specialized ambulatory registration; anamnesis (main complaint, investigation of symptoms and signs, occupational data, personal and family habits/antecedents) and whole body physical examination; request for authorization for hospitalization (identification of the health establishment, patient identification, procedures requested in case of external causes such as accidents or violence); notes, blood or organ donor request, requests for second opinion; discharge summary (summary of clinical history and physical examination, results of main complementary exams, patient outcome and complications, therapy, definitive diagnosis, patient medical advice); and discharge for outpatient care (type of care, procedures performed, results of care). This information is recorded in the outpatient evolution form, written by the professionals involved in patient care, and contains the patient's or guardian's signature, the stamp/signature of the professionals, etc.
The second structure, the logical structure or microstructure, refers to the basic information recorded in the chart according to a default form with a pre-defined structured format that must be completed [9]. To answer the research question, this study focuses on indexical representation adopting the syntagms present in the evolution notes in Patient's Records related to diseases, symptoms and signs. Therefore, the contributions of this article are, among others: a) to propose a new methodology of indexical representation based on syntagmatic modeling of evolutions of notes in analogical (manuscript) Patient's Records; b) to propose a new technique for the indexical representation of these documents, based on the noun, verbal, prepositional, adjectival and adverbial syntagms present in the evolution notes in Patient's Records; c) to inspire and motivate the development of other research studies covering the theme presented here, thus raising other perspectives in this domain. The article is organized as follows: this section introduces the study proposal, followed by a conceptual view of syntagms, their uses, types, and applicability in research on Patient's Records. The methodology of the empirical study done at the Medical and Statistics Archives Services (MSAS) of the Walter Cantidio University Hospital, Federal University of Ceará, is then described. The study data consisted of corpora drawn from two (2) Patient's Records, and emphasis was given to the analysis of the physical structure called evolution in order to identify the presence of the noun, verbal, adjectival, prepositional and adverbial syntagms in the writing of evolution notes. The corpora consisted of twenty evolutions of two Patient's Records, totaling eight volumes. The syntagms referring to diseases, symptoms and signs were identified in the evolution of notes in the patient's records and then selected, and the index was organized with the mapped syntagms.
As a result, 1200 syntagms were mapped referring to the names of diseases, symptoms and signs, and the indexes were organized by the set of mapped syntagms. It is concluded that the syntagms can contribute to the access and retrieval of information with greater precision in the Medical and Statistics Archives Services (MSAS). Therefore, we believe that this research evidences paradigm changes in comparison to the existing models of indexical representation.

Representation of Patient's Records
Reflections around syntagmatic modeling have brought paradigm shifts not only for studies of terminology and linguistics, but also for the field of information science, especially within the scope of indexical representation. This is noticeable because syntagms composed of more than one lexical unit carry in their essence the individualization or particularity of a subject and, consequently, its property.
According to Ferdinand de Saussure, "in discourse terms establish among themselves, by virtue of their chain, relationships based on the linear character of language, which excludes the possibility of pronouncing two elements at the same time" [11]. These combinations, "which rely on extension, may be called syntagms" and consist of "always two or more consecutive units" [11]. Thus, "Put in a phrase, a term only acquires its value because it opposes the preceding or following, or both" [11]. Saussure also says that "[...] the linguistic sign joins not a thing and a word, but a concept and an acoustic image". For him, the terms implied in the sign (the concept and the acoustic image) "[...] are united in the brain by an associative bond" [11]. Sautchuk's [12] reflections ratify Saussure's thinking that "nothing in language works alone": language units must be associated in at least two units so that they can function. If we take "pain" as an example to refer to a symptom, the term will only have semantic value if it is accompanied by some other: headache, body ache, pain in the left leg. And in healthcare, this makes a huge difference. For Bally, "every syntagm is […] the product of a relationship of grammatical interdependence established between two lexical signs belonging to two complementary categories of one another" [13]. In the investigated literature, several types of syntagms were found, and for this study we highlight: a) Nominal Syntagm (NS) has the name as its central core. According to Azeredo, "we use nominal syntagms to designate portions of our world experience conceived as real or imagined, natural or cultural, unique or generic, concrete or abstract units" [14]. He further states that "The selection of the forming elements of the nominal syntagm thus obeys the need to make the referenced content accessible through it to the interlocutor. We call this the referencing" [14].
In the context of Librarianship and Information Science, Kuramoto [15] says that "The nominal syntagm is the smallest part of the information-carrying discourse." For example: asthma, myeloma. b) Adjectival Syntagm (AdjS.) has an adjective as a nuclear element serving as a name modifier. It may also be accompanied by a verb and/or introduced by a preposition. According to Perini [16], this adjective may have the function of predicative, complement, adjunct or affix. In the constitution of certain sentences, there is what is known as the adverbial specifier. For example: good general condition. c) Verbal Syntagm (VS) has a verb or verbal syntagm as the nucleus of the sentence and keeps the predicating function of the sentence, that is, it enables the entities referred to by the noun syntagms to become, within the sentence, "the subject of any comment that is subject to the temporality that characterizes the sentence and enables the expression of the dynamics of events and the flow of life" [14]. For example: the joint involvement of the pathology persists. d) Adverbial Syntagm (AdvS.) has an adverb as a nucleus. It may occasionally be accompanied by adverbial or prepositional expressions [17]. It can also explain a noun. For example: extremely agitated. e) Prepositional Syntagm (PS) has a preposition or prepositive syntagm as its core. For Vigier [18], it "evidences a frontal position, reference to a domain of activity, regulates the simple preposition immediately after a possibly modified name". It never appears in the reduced form, the use of the complement or adjunct being mandatory. For example: rotator cuff injury on the right.
Syntagms are already being proposed as metadata for indexical representation with the aim of obtaining better information retrieval, particularly in scientific and literary documents, especially in the context of Documentary Languages (DL), which are structured with terms called descriptors. Le Guern [19] argues that the "descriptor is particularly a nominal phrase", and all nominal phrases present in a text to be indexed are considered descriptors in information systems. In the indexical representation and organization of Patient's Records, the literature shows that there are already some works proposing the adoption of syntagms. A publication by Charlet et al. [20] shows the importance of the contribution of terminology and linguistics to indexing and retrieving information from Patient's Records using syntagms, since they are widely used in the writing of these documents precisely because they express the substance of what they state. For Aristotle apud Santos [21], "substance" is the traditional translation of a word that literally means "reality" or "entity." As stated by Aristotle [22] […] [24] also proposed a system that enables the use of various terminological languages to index discharge summaries, and the results of their investigation show that these languages contribute to more accurate indexing. Sibanda [25] developed a system for automatic semantic category recognition, adopting algorithms for detecting semantic relationships between concepts to extract key information from patient's record discharge summaries by automatically representing semantic categories for diseases and symptoms that highlight the semantic relations between the syntagms.
These initiatives involve electronic Patient's Records; in the present study, however, the records are analogical (handwritten), and no electronic system could be adopted for the proposal of syntagmatic modeling of these Patient's Records. Even though this modeling proposition is completely manual, it can make very important contributions to indexical representation and, consequently, to coding and access to information in the Medical and Statistics Archives Services (MSAS).

Method and Data Description
This qualitative study proposes a modeling from the perspective of indexical representation, adopting the syntagms used in the annotations referring to the names of diseases, symptoms and signs registered in the evolution of notes in Patient's Records, with the aim of offering possibilities of information retrieval with better quality. The bibliographic survey did not identify studies addressing this theme in the context of evolution notes in Patient's Records. Therefore, we consider it to be a pioneering study dedicated to this theme in the context of analogical or manuscript records, applicable in conditions similar to those of the research corpora.
In the literature, studies that take into account complex linguistic units (syntagms) for the representation of Patient's Records were found for patient discharge summaries, but not for the evolution of notes. Patient's Records belong to the so-called health documentation and are considered a dossier, consisting of a set of documents characterized as forms. They contain all the information that expresses the health condition of a patient and the actions taken to restore it. Friedman [26] and Uzuner et al. [27] teach us that the medical record is characterized as a collection of very heterogeneous documents regarding format, content and semantics.
One of the components of the patient's record is the evolution form, whose wording details the patient's state of health; its importance dates back to Imhotep (2850 to 525 BC) and Hippocrates of Kos (460 - 370/377 BC). In 1960, Professor Lawrence L. Weed [28] developed a methodology for recording the evolution of data and information in Patient's Records that was called SOAP (Subjective, Objective, Assessment, and Plan). These records are registered by the multiprofessional health team, which includes doctors, nurses, nursing technicians, nutritionists, physiotherapists, etc. In the locus of our research, all health-related information is manually recorded, according to each fact observed, taking into account the four types of data and information proposed in the Weed model. Indeed, Weed's methodology has important features because of its "objectivity, organization, greater accessibility to information for decision-making, and the systematic description of the evidence and perceptions that underlie conclusions and diagnoses, and therapy during patient follow-up" [28]. As the Evidence Based Medicine Working Group (EBMWG) understands, SOAP can contribute to the practice of Evidence Based Medicine, as it makes it possible to construct concept maps related to the patient's health status and also to the actions taken for solutions [29]. The treatment of qualitative data is based on the proposal of Content Analysis (CA) by Laurence Bardin [30]. For her, this methodology is a protocol of "communications analysis aiming to obtain [...] indicators that allow the inference of knowledge related to the conditions of production/reception (inferred variables) of these messages". CA is carried out by observing the following phases: pre-analysis, material exploration, and treatment of results (inference and interpretation). According to Bardin [30], the pre-analysis makes a first contact with the documents that could be submitted for analysis.
Thus, we read two analogical Patient's Records of the nephrology and rheumatology specialties in order to identify the evolutions. All evolutions of the two Patient's Records were read and analyzed in order to identify the syntagms present in this structure. After this first phase, we defined the corpora of this research, which consisted of a total of twenty evolution notes.
The second phase consisted of the exploration of the material, adopting as the codification procedure the five types of syntagms that we adopted for the research: Nominal Syntagm (NS); Adjectival Syntagm (AdjS.); Verbal Syntagm (VS); Adverbial Syntagm (AdvS.); Prepositional Syntagm (PS). The third phase was the treatment and interpretation of the results obtained. The various types of syntagms were then structured according to the following categories: diseases, symptoms and signs.
The extraction of the syntagms was done manually because, as previously mentioned, the Patient's Records studied are analogical, in manuscript format, and it was not possible to find software for automatic recognition of the diverse handwriting of the notes that constitute the evolutions of notes in patient's records. The extraction was made by the author following the methodology of the lexical-semantic pattern of grammar rules, as these highlight the lexical and semantic context in which terms are inserted and connected. This decision was taken in order to evaluate, within the notes in Patient's Records, the importance of each syntagm as a possibility of indexical representation of evolutions of notes in patient's records. This study followed the ethical and bioethical requirements of the Information Sciences and Health Sciences areas. It was submitted to and approved by the University's Ethics Research Committee and included in Plataforma Brasil. Data remained anonymous and confidential.
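The codification scheme described above (five syntagm types crossed with three categories) can be sketched as a small data structure. This is only an illustrative sketch and not a tool used in the study, where coding was entirely manual. The class names, the `evolution_id` field and its values are hypothetical, and the category assigned to each example syntagm is an assumption made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class SyntagmType(Enum):
    """The five syntagm types adopted as the codification procedure."""
    NOMINAL = "NS"
    ADJECTIVAL = "AdjS."
    VERBAL = "VS"
    ADVERBIAL = "AdvS."
    PREPOSITIONAL = "PS"


class Category(Enum):
    """The three categories used to structure the coded syntagms."""
    DISEASE = "disease"
    SYMPTOM = "symptom"
    SIGN = "sign"


@dataclass(frozen=True)
class CodedSyntagm:
    text: str
    syntagm_type: SyntagmType
    category: Category
    evolution_id: str  # hypothetical identifier of the evolution note


def count_by_type(coded):
    """Count how many coded syntagms fall under each syntagm type."""
    return Counter(item.syntagm_type for item in coded)


# Example syntagms taken from the text; categories are illustrative assumptions.
coded = [
    CodedSyntagm("chronic kidney disease", SyntagmType.NOMINAL,
                 Category.DISEASE, "R1-E01"),
    CodedSyntagm("good general condition", SyntagmType.ADJECTIVAL,
                 Category.SIGN, "R1-E02"),
    CodedSyntagm("rotator cuff injury on the right", SyntagmType.PREPOSITIONAL,
                 Category.DISEASE, "R2-E01"),
]
type_counts = count_by_type(coded)
```

Keeping the syntagm type and the category as separate fields mirrors the two-step coding in the text: first by grammatical type, then by disease, symptom or sign.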

Analysis of Data and Discussion of Results
In order to precisely analyze and capture data, we carefully read the complete set of evolution notes registered in the two Patient's Records, focusing on the identification of the various items of the logical structure and on the apprehension of the peculiarities of evolution notes as registered in the Patient's Records. This detailed analysis led to a total of 1200 syntagms mapped. It is noteworthy that reading and rereading these documents was a time-consuming task, since the notes were handwritten and a wide variety of handwriting was present, which required great care in understanding the handwriting or "orthography" and, therefore, in correctly identifying the syntagms. For this reason, we chose to transcribe the data as it was understood and then to consult specialized sources to ratify or rectify the findings. We also maintained all observations made in the Patient's Records, as we consider them to be normal codes used in the writing of these documents. Some examples are shown in Figure 1. In possession of the mapped syntagms, we held discussions with health specialists so as not to make mistakes in identifying what is effectively defined as a disease, symptom or sign. According to the dictionary of the Real Academia Española (RAE), the word disease comes from the Latin infirmitas, and it refers to a "pathological alteration of one or more organs, that gives rise to a set of characteristic symptoms" [31]. In turn, the symptoms are "the reports, the complaints, what the patient tells the doctor during the consultation. This is what the doctor listens to or asks the patient during the medical interview (anamnesis). It's a subjective complaint, what the person is feeling or felt." On the page of the Brazilian Medical Academy (ACM), we find that a sign is an "objective manifestation of a disease. Pathognomonic sign: unambiguous manifestation of a pathology." [32] In Table 1 some examples are given.
Symptoms or signs informed by the patient were registered as informed, even if usually not present in the disease. As shown in Table 1, "darkened foamy urine" was informed by the patient and kept in the study records even though it is not a finding in isolated back pain. In the third phase, we identified, analyzed and evidenced the patterns (diseases, signs and symptoms) in the mapped data set. In this phase, we reviewed the syntagms, which demanded special attention, as the handwriting of the Patient's Records led to many misreadings. At this stage, we therefore proceeded to clarify some passages, checking the specialized sources and discussing with our consultants (a doctor and a nurse) our understanding of what had been noted. At various times there were difficulties, given that certain expressions could not be deciphered very well, and sometimes the correspondence between what had been recorded and the sources left doubts. Table 2 presents some types of syntagms that form the basis for modeling the syntagmatic proposal of indexing. After this step, we characterized the syntagms throughout the dataset by identifying their semantic characteristics. As mentioned earlier, the data set studied consisted of twenty evolutions, and a total of 1200 syntagms were identified: Nominal, Verbal, Adjectival, Adverbial and Prepositional. Table 3 shows an excerpt. After identifying the syntagms, we structured the index with these concepts; an excerpt is presented in Table 4. Based on the example of the index shown here, we argue that the use of syntagms referring to the names of diseases, symptoms and signs can make information retrieval in MSAS more accurate. This is because they specify the substance of these concepts and consequently produce less noise and interference when seeking information in the context of the communication process related to the care of the sick person.
Through CA it was also possible to learn the ways in which the discourses of each healthcare professional are recorded in the evolution of notes in the patient's record and how they reflect the health status of patients, the care actions performed and the translation of patients' complaints. Thus, we corroborate Willig [33], who states that ways of writing help individuals to create meaning from their experience.
These results are in line with Otlet [34], who states in his Documentation Treatise that "The purpose of organized documentation is to be able to provide documented information on any kind of fact and knowledge: 1° universal as to its object; 2° correct and true; 3° complete; 4° fast; 5° updated; 6° easy to obtain; 7° gathered in advance and prepared to be communicated; 8° made available to as many as possible". All these data enable the contextualization of knowledge in evolution notes in Patient's Records, according to a specific domain of use. Additionally, the results of this study also make evident the need and urgency for interdisciplinarity and for interprofessional education and professional practice. Domains such as Library and Information Science, Health Professions, Linguistics and Terminology, and Information and Communication Technologies can join efforts to develop initiatives related to innovations in Knowledge Organization Systems (KOS) and thus benefit and improve the quality of documentation in the health domain. The dialog between these areas has the potential to significantly improve access to and retrieval of information, with immediate applications in healthcare, while also favoring communication within the healthcare team and with patients and their families. Fostering the development of such a culture will create opportunities and bring new horizons for teaching, research, services and management. The adoption of syntagms in indexical representation discloses the ontological essence of "being while being" and, naturally, this approach is of great importance in information retrieval from Patient's Records.
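The structuring of the index from the mapped syntagms can be illustrated with a minimal sketch. It assumes each mapped syntagm is available as a (syntagm, category, evolution identifier) triple; this input format, the function name `build_index` and the identifiers are hypothetical, since the index in this study was organized manually. The example syntagms come from the text, but the evolution identifiers and the category given to each are invented for illustration.

```python
from collections import defaultdict


def build_index(entries):
    """Group syntagms by category and list the evolutions where each occurs.

    entries: iterable of (syntagm, category, evolution_id) triples.
    Returns {category: {syntagm: [evolution_id, ...]}} with syntagms in
    alphabetical order and evolution identifiers sorted and de-duplicated.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for syntagm, category, evolution_id in entries:
        # Normalize case and spacing so spelling variants share one entry.
        key = " ".join(syntagm.lower().split())
        grouped[category][key].add(evolution_id)
    return {
        category: {s: sorted(ids) for s, ids in sorted(by_syntagm.items())}
        for category, by_syntagm in grouped.items()
    }


# Hypothetical triples; identifiers and categories are illustrative only.
entries = [
    ("Acute Kidney Failure", "disease", "R1-E03"),
    ("acute kidney  failure", "disease", "R1-E07"),
    ("Chronic Kidney Disease", "disease", "R1-E01"),
    ("darkened foamy urine", "symptom", "R1-E03"),
]
index = build_index(entries)
```

Each index entry keeps the whole syntagm as the access point, so a search for "acute kidney failure" leads directly to the evolutions that record it rather than to every note containing "kidney".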

Some Conclusions
In this study we propose a model of indexical representation using the syntagms adopted for the names of diseases, symptoms and signs in the evolution of notes structure of Patient's Records, and therefore a new paradigm for the representation, organization and retrieval of information from these documents with greater precision. When considering the context of indexical representation, it is necessary to observe that the rules may vary according to the purpose of the indexing system created. For example, Computerized Medical Archives and medical publications have their own indexing rules, and the terminology used for indexing healthcare documents may differ (e.g., MeSH, ICD-10, or SNOMED). However, regardless of the rule or documentary language, we consider that in modeling the indexical representation of the evolution of notes in the patient's record, the syntagms are key elements, as they express the substance to which they refer, and, in patient care actions, interference in the communication process makes all the difference.
Through the use of syntagms it is possible to offer access and retrieval of information with higher added value in the context of the evolution of notes in Patient's Records, since they express the substance of what they name. Because of these characteristics and of the relevance of writing evolutions of notes in Patient's Records, we believe that, in the process of communication within the multiprofessional health team, these elements contribute to reducing interference, which will improve the quality of patient care. We know that most information retrieval systems adopt simple terms for indexical representation and retrieval of information or documents. However, our understanding is that this way of representation is not necessarily effective for representing the content of documents such as Patient's Records and will certainly bring much ambiguity, because the names of diseases, symptoms and signs, when made up of more than one word, lose their essence when separated, as can be seen in "acute myeloid leukemia" (AML) versus "leukemia". Given this fact, we consider that in an information retrieval system for Medical and Statistical Archives Services (MSAS), syntagms will make contributions both in understanding the planning of patient care actions and in research on diseases.
With regard to our research question, "how to apply the syntagmatic modeling in the context of indexical representation in the evolution of notes in the patient's record structure?", we infer that, just as in any other domain of knowledge, such modeling should also observe the stages of definition of the lexicon and base symbols, definition of a grammar to be adopted, and definition of a semantics, as a final contribution to the construction of indexes, in this case constituted by the syntagms. The proposed model represents a paradigm shift in relation to the use of uniterms or concepts composed of a maximum of three lexical units for the indexical representation of health documentation, such as the evolution of notes in Patient's Records.
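The loss of precision that occurs when a multi-word name is reduced to a simple term can be made concrete with a small sketch. It compares matching on the uniterm "leukemia" with matching on the whole syntagm "acute myeloid leukemia". The three notes are invented for illustration and are not data from the corpora; the function names are hypothetical, and the matching is plain token comparison rather than any system described in this paper.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def matches_uniterm(note, term):
    """True if the single term occurs anywhere in the note."""
    return term.lower() in tokenize(note)


def matches_syntagm(note, syntagm):
    """True if all words of the syntagm occur consecutively, in order."""
    tokens = tokenize(note)
    target = tokenize(syntagm)
    if not target:
        return False
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))


# Invented notes, for illustration only.
notes = {
    "N1": "Patient with acute myeloid leukemia, febrile.",
    "N2": "History of chronic lymphocytic leukemia.",
    "N3": "Acute kidney failure with tubular necrosis.",
}

uniterm_hits = [k for k, v in notes.items() if matches_uniterm(v, "leukemia")]
syntagm_hits = [k for k, v in notes.items()
                if matches_syntagm(v, "acute myeloid leukemia")]
```

In this toy example the uniterm retrieves both leukemia notes (N1 and N2), while the syntagm retrieves only N1, the one note that actually records the disease being sought.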
Finally, we conclude that data from this research can be used in other qualitative studies allowing new perceptions regarding the use of syntagms for indexical representation of Patient's Records to emerge, thus contributing to more specific evidence-based medicine studies and also to several other research themes.