Collecting and using log data

Logging refers to the storage and use of log data. Log data is used to determine what happened, when and why. Log data is collected by operators and by devices connected to the internet. This guideline is intended for organisations that have in-house information security expertise.

A log is a chronological recording of events, or transactions, and their causes. Transactions and changes in information systems, applications, data networks and data content are recorded in the log, that is, logged.

Log data is generated all the time, everywhere. Your computer maintains an access log, wireless network base stations and wired routers store transaction logs, your phone operator keeps a communication log, and the software you use every day has access control logs, error logs and so on.

An adequate log contains

Step 1
Timestamp
The system logs the time of the transaction.
Step 2
Transaction and user
The system logs what was done or attempted and the user.
Step 3
Access right
The system logs which authorisation or access rights enabled the transaction.
Step 4
Transaction source
The system logs where the transaction was performed and where the change data originated.
Step 5
Transaction status
The system logs whether the user succeeded in the attempted action. If the action failed, it records the cause of the failure.
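
The five elements above can be sketched as a structured log record. This is a minimal illustration, not a prescribed format: the class name `LogEntry`, its field names and the helper functions are assumptions made for this example.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LogEntry:
    """One log record with the five elements listed above (names are illustrative)."""
    timestamp: str                       # when the transaction happened (ISO 8601, UTC)
    action: str                          # what was done or attempted
    user: str                            # who did or attempted it
    access_right: str                    # authorisation that enabled the transaction
    source: str                          # where the transaction was performed
    success: bool                        # whether the attempted action succeeded
    failure_cause: Optional[str] = None  # filled in only when the action failed


def make_entry(action, user, access_right, source, success, failure_cause=None):
    """Build an entry stamped with the current UTC time."""
    return LogEntry(
        timestamp=datetime.now(timezone.utc).isoformat(),
        action=action,
        user=user,
        access_right=access_right,
        source=source,
        success=success,
        failure_cause=failure_cause,
    )


def to_json_line(entry):
    """Serialise an entry as one JSON object per line."""
    return json.dumps(asdict(entry), sort_keys=True)
```

Writing one JSON object per line keeps the entries easy to filter later, both with command-line tools and in a centralised log system.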

Ensure data protection

If a log contains personal data, its processing must comply with the EU General Data Protection Regulation (GDPR). According to the GDPR data minimisation principle, personal data must be limited to what is necessary for the purposes of processing the data. A privacy statement must be drawn up concerning the personal data in the system.

Excessively detailed monitoring of transactions combined with the identities of individuals may violate the individuals’ data protection.

As a rule, the following data should not be stored in a log:

Personal identity codes
Special categories of personal data within the meaning of the GDPR (‘sensitive data’)
Credit card numbers
Passwords (not even in encrypted form)
Intersystem access keys and secrets
Authorisation data
Interpersonal communication content.
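
As a safety net for the list above, log messages can be masked before they are written. The sketch below uses Python's standard `logging.Filter`; the regular expressions and the `[REDACTED]` markers are illustrative assumptions and will not catch every format. The primary control is still not to pass such values to the logger at all.

```python
import logging
import re

# Illustrative patterns only; real systems need patterns matched to their own data.
_PATTERNS = [
    # key=value pairs whose key suggests a password or an access secret
    (re.compile(r"(?i)\b(password|passwd|secret|token|api_key)=\S+"), r"\1=[REDACTED]"),
    # 13 to 19 digits, optionally separated by spaces or hyphens (card-number-like)
    (re.compile(r"\b\d(?:[ -]?\d){12,18}\b"), "[REDACTED-CARD]"),
]


def redact(text: str) -> str:
    """Mask values that match any of the patterns above."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """Logging filter that masks sensitive values in the final message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so that values passed as arguments are masked too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted
```

The filter can be attached to a handler with `handler.addFilter(RedactingFilter())`, so that every record passing through that handler is masked.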

Logs for different purposes

Logs can be classified based on their form as well as purposes and methods of use.

A maintenance log is used to maintain data on changes to the system’s operation and access rights as well as to manage error situations. This type of log is necessary for version management and for monitoring the overall architecture of the operating environment.

An access log, or an event log, is probably the most common and indispensable log type. It registers user logins and logouts as well as data on other normal processes performed by the system. The system’s modules leave a trace in the access log when calling other modules. Entries on printing and reading data content in a database are recorded in the access log.

Entries on any additions and changes to and deletion of data are recorded in a change log. When the source of the changed data appears in the log entries, its validity can be traced and verified, if necessary.

An error log is particularly necessary when solving problem situations. When the cause of an error is logged as accurately as possible, it is easier to fix the cause.

There are also other commonly used log formats. A communication log can contain data about conveyed communications: message source, endpoint and other data, such as time, quantity, unique identifier and status. Identification data is an example of communication log data. The most common e-mail servers are set up to maintain a communication log.

A holder log tells you who was the holder of an internet address, a telephone number, a domain, or a rental car at a specific point in time. Holder data can be connected directly to a person, an organisation or a system.

Log customisation

Step 1
How much is appropriate?
The easiest way to determine the appropriate amount of log data is to try it out. Start carefully and increase the level of detail as required. If excessive logging is enabled at the deployment stage, the system will quickly be overwhelmed, and finding useful data in the vast mass becomes difficult. With regard to personal data, the GDPR data minimisation principle requires that only data whose processing is necessary is processed.
Step 2
Log retention period
Depending on the protected data, the sufficient retention period varies between 6 and 24 months. The GDPR does not specify precise retention periods for personal data. The controller must assess the retention period and necessity of personal data for the purpose of use in question. Personal data may only be retained for as long as it is necessary for its purpose of use. The controller must assess and be able to justify the retention periods.
Step 3
Archiving and erasure
Archiving log data ensures access to older entries that cannot be kept in an active log, for example because of limited storage space.

In principle, data must be erased or anonymised when their processing is no longer required. Log data usually has a life cycle at the end of which it is no longer of use, and it is not necessary to retain the data. A procedure should be in place for erasing log data.
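
An erasure procedure can start from a simple split of entries by age. The sketch below is one possible approach under stated assumptions: each entry is a dictionary with an ISO 8601 `timestamp` key, and the function name `split_by_retention` and the retention value passed to it are hypothetical. The controller still has to decide and justify the actual retention period.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(entries, retention_days, now=None):
    """Return (kept, expired) lists based on each entry's timestamp.

    Entries older than `retention_days` are returned as expired so that the
    caller can erase or anonymise them.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    kept, expired = [], []
    for entry in entries:
        # Timestamps are assumed to carry a UTC offset, e.g. "+00:00".
        recorded_at = datetime.fromisoformat(entry["timestamp"])
        if recorded_at >= cutoff:
            kept.append(entry)
        else:
            expired.append(entry)
    return kept, expired
```

Running such a job on a schedule and logging its own results makes it easier to show that the retention policy is actually followed.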
Step 4
Responsibilities related to log processing
The main responsibility for the processing of logs lies with appointed administrators. To ensure traceability, individual administrators should not be authorised to edit the centralised log management system.

The log description specifies for which purposes log data is collected and for which purposes it may be used. The collection of each piece of log data must be justified. The security and appropriateness of logs must be audited on a regular basis. Administrators monitor the necessary logs, data security officers monitor logs related to data security, and ultimately, the organisation’s senior management is responsible for the log data.
Step 5
Passwords recorded at incorrect locations remain visible
A login log typically records both successful and failed login attempts. Login identification data includes, for example, the user ID, the time of the login attempt and the address from which the attempt was made. Failed login attempts also leave an entry that can be viewed in the log. How often have you accidentally typed your password in the username field and pressed enter? The password is then stored in a log in a readable format, where the administrator can access it. Always change your password if you have accidentally entered it in a visible location.
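
One way to reduce this risk on the system side is to avoid echoing unrecognised usernames into the login log. The following is a minimal sketch, not a recommendation for every environment: the function names, the entry layout and the `<unknown user>` placeholder are assumptions made for this example.

```python
def loggable_username(supplied, known_usernames):
    """Return the username only if it belongs to an existing account.

    Anything else (a typo, or a password typed into the wrong field) is
    replaced by a fixed placeholder so it never reaches the log.
    """
    return supplied if supplied in known_usernames else "<unknown user>"


def failed_login_entry(supplied, source_address, timestamp, known_usernames):
    """Build a log entry for a failed login attempt."""
    return {
        "timestamp": timestamp,
        "action": "login",
        "user": loggable_username(supplied, known_usernames),
        "source": source_address,
        "success": False,
    }
```

The trade-off is that the log no longer shows which non-existent usernames were tried, which can matter when investigating guessing attacks. Users should still change a password that was typed into a visible field.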
Step 6
Ensure data protection
If a log contains personal data, its processing must comply with the GDPR. According to the GDPR data minimisation principle, personal data must be limited to what is necessary for the purposes of processing the data. A privacy statement must be drawn up concerning the personal data in the system.

Excessively detailed monitoring of transactions combined with the identities of individuals may violate the individuals’ data protection.
Step 7
Access control
The processing of logs requires thoroughly planned access control. Logs and their logging services can best be protected against unauthorised use by using a log environment whose integrity has been secured and which is separated from the rest of the production environment.

In particular, the separation of the system from the rest of the environment should be ensured so that a security breach elsewhere cannot affect the integrity of log entries. In addition, log data should be backed up, the processing of log data should itself be logged, and alerts should be set up for unauthorised editing attempts.
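
One common technique for making unauthorised edits detectable is a hash chain, in which each stored entry carries a hash computed over the previous hash and its own content. The sketch below is an illustration under stated assumptions, not a description of any particular product: the function names and the chain layout are invented for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, entry):
    """SHA-256 over the previous hash and the canonical JSON form of the entry."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(chain, entry):
    """Append an entry together with a hash that links it to its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "hash": _digest(prev_hash, entry)})


def first_tampered_index(chain):
    """Return the index of the first link that fails verification, or None."""
    prev_hash = GENESIS
    for index, link in enumerate(chain):
        if link["hash"] != _digest(prev_hash, link["entry"]):
            return index
        prev_hash = link["hash"]
    return None
```

A chain like this only shows that something was changed. It does not prevent the change, and someone able to rewrite the whole chain could recompute every hash, so the latest hash should also be kept somewhere outside the log environment.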

Legislation on logs

Log processing must have a legal basis.

The legislation sets requirements for the content of logs, the retention period, the methods of ensuring the integrity of data and the purpose of the log.

Logs are needed, for example, to ensure the functionality of information systems, for compiling statistics on system use and to ensure data security. The collection and processing of the log must have a legal basis.

As there are various types of logs, their processing methods and rights depend on what type of data the log contains and for what purpose the log was originally collected.

Log requirements based on the legislation relate to the content, retention period, the methods of ensuring the integrity of data and the purposes for which the log data is used. In addition, logs must be processed in accordance with the organisation’s internal guidelines, if any.

Personal data makes the log a personal data file

If the log data contains personal data, the log is considered a personal data file. Personal data refers to any information relating to an identified or identifiable natural person (data subject); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person. In this context, particular attention must be paid to the obligations imposed by the GDPR and, for example, a statement must be drawn up on the processing of personal data.

When connected to log data, a name or an e-mail address, for example, are considered personal data. In some cases, an IP address may also be considered personal data.

When processing a log that contains electronic communication transfer data, the log data must be processed in compliance with Chapter 17 of the Act on Electronic Communications Services (917/2014). Transfer data includes, for example, data about the sender and recipient of an e-mail message, network addresses, the duration of the connection, routing, point of time, and the amount of data transferred.

If the log or the technical system that produces it is intended to be used for the supervision of personnel in order to, for example, protect business secrets or investigate cases of misuse, the so-called Lex Nokia sections of Chapter 18 of the Act on Electronic Communications Services shall apply to the use of the log. In that case, users must be informed of the procedure. In addition, the employer must organise a co-operation procedure.

Log management must be restricted so that all logs or the data contained in a single log are not accessible to everyone. Thus, processing authorisations should be limited on the basis of, for example, an employee’s information requirements based on their duties. This necessity requirement, or the so-called “need to know basis”, is recorded both in the principles of the GDPR and in the Act on Electronic Communications Services.
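
The need-to-know principle can be illustrated as a per-role view of a log entry. This is a minimal sketch: the role names, the field names and the mapping in `VISIBLE_FIELDS` are hypothetical and would have to be derived from each organisation's actual duties and legal basis.

```python
# Hypothetical mapping from role to the log fields that role needs for its duties.
VISIBLE_FIELDS = {
    "administrator": {"timestamp", "action", "status", "source"},
    "security_officer": {"timestamp", "action", "status", "source", "user"},
}


def view_for_role(entry, role):
    """Return only the fields of a log entry that the given role may see.

    Roles that are not listed get an empty view (deny by default).
    """
    allowed = VISIBLE_FIELDS.get(role, set())
    return {key: value for key, value in entry.items() if key in allowed}
```

Denying by default means that a new or misspelled role sees nothing until someone has explicitly decided what it needs.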

Logger’s check list

The correct construction of a log and the processing of log data require thorough planning, a clear process and unambiguous instructions. This is a short check list for loggers to facilitate successful log projects:

Consider the purpose of logging.
Is the data stored in the log necessary for the purpose of use?
Remember the legal obligations for different types of logs.
Process log data in compliance with predefined systems and procedures.
When the data is no longer needed, erase it.
Set access rights based on information requirements.
The data protection and legal protection of data subjects and the administration must be ensured.
Provide users with sufficient information, especially if the log is used for technical monitoring. Remember the co-operation procedure.

Log processing requirements are based on, for example, the following legislation

General Data Protection Regulation (GDPR)

Logging and SIEM

What is SIEM (Security Information and Event Management)?

SIEM is usually described as a data security product that combines information produced by other systems. However, this view undersells what SIEM can actually offer. Log management and SIEM should primarily be considered a process or a method of managing data security that enables data collection from several systems that are not directly dependent on each other. The collected data is either stored or processed in a centralised location. This creates the opportunity to detect larger events consisting of several smaller transactions. The greatest strength of SIEM lies precisely in the opportunity to display and correlate data at a centralised hub.
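
The idea of detecting a larger event from several smaller transactions can be sketched with a simple correlation rule over events collected from different systems. The rule, the event layout, the threshold and the time window below are all assumptions chosen for illustration, not the behaviour of any SIEM product.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def correlate_failed_logins(events, threshold=3, window=timedelta(minutes=5)):
    """Return users with a burst of failed logins spread over several systems.

    A user is flagged when `threshold` consecutive failed logins fall within
    `window` and were recorded by more than one source system.
    """
    failures_by_user = defaultdict(list)
    for event in events:
        if event["action"] == "login" and not event["success"]:
            failures_by_user[event["user"]].append(event)

    flagged = set()
    for user, failures in failures_by_user.items():
        failures.sort(key=lambda e: e["time"])
        # Slide over every run of `threshold` consecutive failures.
        for start in range(len(failures) - threshold + 1):
            burst = failures[start:start + threshold]
            within_window = burst[-1]["time"] - burst[0]["time"] <= window
            systems = {e["system"] for e in burst}
            if within_window and len(systems) > 1:
                flagged.add(user)
                break
    return sorted(flagged)
```

No single source system would see this pattern on its own, which is the point of collecting the events in one place. Thresholds like these need tuning to each environment.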

The greatest misunderstanding concerning SIEM is the assumption that it is a finished product that provides added value through its mere existence. Merely installing a product rarely results in any remarkable added value or improvement of the situational picture.

Deployment of centralised log management

The most important thing is to first define a policy for managing logs: what is logged, how it is logged, and in particular, what the collected log data is used for. One of the key issues in log management is the enormous number of transactions. It can quickly lead to a situation where system administrators can no longer efficiently find the required data in the logged mass. When introducing centralised log management, the reason for doing so should be specified precisely.

For example, the reason for deploying a centralised log management system can relate to requirements set by a regulation. In that case, it is not expedient to build an excessively heavy system if the organisation as a whole is not prepared to start using the SIEM process. In addition, it is advisable to exercise caution if the marketing material states that the organisation would meet certain requirements immediately after the SIEM product is deployed. In practice, this is always just a marketing phrase and should not be trusted as such.

When implementing centralised log management, especially SIEM, it should be remembered that Rome was not built in a day. In practice, this means that the number of monitored log sources should be increased gradually and that the organisation’s operating methods should be developed at the same pace, as far as the available resources allow. It is often advisable to set up a single log server and try command line-based queries before considering the procurement of a SIEM system.

Log management in practice

Centralised log management and SIEM do not revolutionise the security of organisations as such. Even if the organisation invests resources (procurement, installation, and connection to source systems) in the SIEM system, it will not generate alarms or observations on its own. Commercial SIEM products, in particular, do contain predefined identifiers that enable the detection of harmful events. However, each organisation has a unique IT environment, and generic built-in identifiers are not sufficient to provide significant added value beyond what antivirus software already offers. A specific sequence of events may look like an attack in one environment, while in another it may be part of normal operation.