Oracle: Understanding REDO

In Oracle, REDO is not the same as the UNDO functionality. Unlike Microsoft Word, we cannot simply REDO an operation after UNDOing it. The scope of REDO is limited to recovery and to keeping standby databases in sync. REDO logs do not hold table data as such; they record the changes produced by the DML statements fired by users. Each change is mapped to an SCN (System Change Number), a unique number that helps Oracle identify a specific REDO operation. REDO entries are initially stored in memory (the redo log buffer, part of the SGA), and whenever a user issues a commit, Oracle flushes all entries since the last flush to disk. This also happens every three seconds and whenever the redo log buffer is one-third full, and is carried out by the Log Writer (LGWR) background process.
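
To see this in action, we can peek at the redo log groups and the current SCN; a quick sanity check, assuming a user with access to the V$ views:

-- one row per redo log group, with its size and status
SELECT GROUP#, STATUS, BYTES/1024/1024 AS MB FROM V$LOG;
-- the SCN the database is currently at
SELECT CURRENT_SCN FROM V$DATABASE;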

The database must have more than one REDO log group, and the groups should be multiplexed. Oracle writes to the REDO log files in a circular fashion, overwriting the oldest group when the current one fills, so with only a single file we would retain just the most recent changes; for recovery we must have more than one REDO log file. Multiplexing means keeping more than one copy (member) of each REDO log group, so that if one member gets corrupted, another can serve as a backup. The members must be written in parallel to stay synchronized; otherwise the copies would hold different sets of data and be of no use for recovery. Apart from this, we should also archive the REDO log files for better recovery options; to achieve this, the database needs to run in ARCHIVELOG mode.
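
A minimal sketch of checking multiplexing and switching the database into ARCHIVELOG mode (assuming SYSDBA privileges; group counts and file paths will differ per installation):

-- list members per group; only one member per group means no multiplexing
SELECT GROUP#, MEMBER FROM V$LOGFILE ORDER BY GROUP#;

-- enabling ARCHIVELOG mode requires a clean restart to the MOUNT state
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;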

How are REDO logs used, then? During database recovery, restoring a backup only brings the database back to the time when that backup was taken. From there, REDO logs are used to replay all transactions sequentially up to the crash point. In an incomplete recovery scenario the same technique is followed, with the DBA using SCN numbers to identify a safe point for the recovery process.
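
As an illustration of SCN-based incomplete recovery, a hedged SQL*Plus sketch (the SCN value is hypothetical, and the database must already be mounted from a restored backup):

-- roll forward through the REDO stream, stopping at a known-safe SCN
RECOVER DATABASE UNTIL CHANGE 1234567;
ALTER DATABASE OPEN RESETLOGS;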

REDO logs are also used to keep standby (disaster recovery site) databases in sync with the production database. There are various sub-categories here for how the data is shipped to the standby sites, depending on how frequently the active database is updated and the business impact of the data being unavailable.
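
For reference, redo shipping to a standby is typically configured through an archive destination pointing at the standby service; a sketch, where standby_db is a hypothetical TNS alias and ASYNC trades a little currency for lower impact on the primary:

ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 =
  'SERVICE=standby_db ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)';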

In summary, REDO logs are one of the core pieces of the recovery process, and unlike UNDO, they expose no functionality directly to the end user, such as rollback or flashback.


Oracle: Understanding UNDO Usage

Having effective UNDO functionality is one of the key requirements for your database, and Oracle manages it beautifully. So what is UNDO used for, and how does it relate to application and batch process coding?

Every uncommitted DML statement generates undo data, which lets the database user roll back a transaction up to a certain point; how far depends on the size of the UNDO tablespace and the retention time. Housekeeping and batch processes (deletes/updates) and background DML jobs generate the most UNDO data because they touch a high number of records. In some cases inserts also have an impact, through index rebalancing activity on large indexed tables. In a normal scenario, every delete generates a good amount of UNDO data. Oracle uses this information to maintain data integrity: when a database user issues a rollback command, Oracle undoes all activity at the database level back to the last commit, using the UNDO tablespace to store all of this information. Likewise, when a batch process has updated some rows but not yet committed, and at the same time another database user queries the same set of rows, Oracle fetches the old values from the UNDO tablespace and the rest of the data from the actual tables. In such a scenario Oracle will not allow the user to view uncommitted data, maintaining data integrity.
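
A small illustration of rollback driven by undo (EMP and PAY_DATE reuse the example table from later in this post):

DELETE FROM EMP WHERE PAY_DATE < DATE '2015-01-01';  -- uncommitted delete generates undo
ROLLBACK;                                            -- Oracle applies the undo; the rows are back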

Oracle provides functionality to configure the UNDO retention time, i.e. for how long the database keeps UNDO data. After setting this value we can force the database to retain UNDO data for the given period. If we enforce undo retention, we must make sure the UNDO tablespace is sized accordingly; otherwise we will hit "unable to extend" errors on the UNDO tablespace.
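
A sketch of the relevant settings (UNDOTBS1 is an assumed tablespace name; check DBA_TABLESPACES for the real one):

ALTER SYSTEM SET UNDO_RETENTION = 86400;          -- keep undo for 24 hours (value is in seconds)
ALTER TABLESPACE UNDOTBS1 RETENTION GUARANTEE;    -- enforce retention even under space pressure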

Oracle also provides the flashback query feature, using which we can view a dataset as it was at a given time. How far back we can go depends on the size of the UNDO tablespace and how frequently the table is updated; in a normal scenario we can look back up to 24 hours, or even a week if database activity is very quiet, depending on the undo retention parameter. To run a flashback query on a specific table, non-schema owners need the FLASHBACK grant on that table. Previously, once a transaction was committed there was no way to find what value a dataset held earlier, short of restoring a backup, which was not an easy route. Using flashback queries is not difficult: we just add the "AS OF TIMESTAMP" option and provide the right value in the past with the correct timestamp format.
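
For example, a flashback query against the EMP table used elsewhere in this post (the timestamp and the report_user grantee are illustrative):

-- non-schema owners first need: GRANT FLASHBACK ON EMP TO report_user;
SELECT * FROM EMP
AS OF TIMESTAMP TO_TIMESTAMP('2016-05-01 09:00:00', 'YYYY-MM-DD HH24:MI:SS')
WHERE EMPLOYEE_NAME = 'SMITH';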

Oracle : Group By Clause

Aggregate functions take one or more input values and return a single output value per chunk of the input recordset. In SQL-based applications and environments, the aggregate functions MAX, MIN, AVG, SUM and COUNT are the most heavily used. These functions are used not only for generating summarized reports but also for sizing up datasets before building queries on complex tables.

But in some cases we cannot use these functions directly, e.g. when we need to find the total salary credited per employee for a given financial year. We can use the SUM function with a date range, but on its own that gives the total salary across all employees. Suppose we add the employee name to the select clause, as in "SELECT EMPLOYEE_NAME, SUM(SALARY) FROM EMP WHERE PAY_DATE BETWEEN :A AND :B". When the database tries to run this query it would have to display every individual EMPLOYEE_NAME against a single aggregated sum, which is not correct from a data integrity perspective, so it raises an error (ORA-00937: not a single-group group function) to enforce consistency. In such cases we use the GROUP BY clause, which tells the database to summarize the output on the basis of a column, as below for the above example.

SELECT EMPLOYEE_NAME, SUM(SALARY) FROM EMP
WHERE PAY_DATE BETWEEN :A AND :B
GROUP BY EMPLOYEE_NAME;

The simple rule of thumb is: whenever we use an aggregate function, every selected column that is not wrapped in a function must be included in the GROUP BY clause. The GROUP BY clause always comes after the WHERE clause and before ORDER BY or HAVING. Static values do not need to be included in the GROUP BY clause. Note that aggregate functions cannot be used in the WHERE clause.

Sometimes we need to filter the aggregated output, e.g. finding employees paid more than 100K in the last year. We cannot apply this filter in the WHERE clause, since the sum of annual salaries is not yet available there. Instead we can use the HAVING clause, which works on the output of the aggregate functions. With HAVING, the above query can be rewritten as below.

SELECT EMPLOYEE_NAME, SUM(SALARY) FROM EMP
WHERE PAY_DATE BETWEEN :A AND :B
GROUP BY EMPLOYEE_NAME
HAVING SUM(SALARY) > 100000;

In summary, the GROUP BY clause is a very useful piece of SQL functionality, and very effective for filtering and grouping large amounts of data.

Oracle : Basic Tips for writing optimised SQL Queries

SQL queries are one of the core parts of every RDBMS-based application. Most of the time a bad SQL query is the reason for performance and data integrity issues. So how can we build optimized SQL queries?

Every query is different, so we cannot follow one identical approach for writing all SQL queries, but some of the basic steps are listed below.

  • Understand the expectation of the problem, then analyze the relevant application tables, columns, and primary/foreign key mappings.
  • Always try to use an indexed column or the primary key in your query. If you do not have this information, consult the application DBAs for the index details; most query tools also provide functionality for viewing table-related information.
  • Avoid hard parsing, i.e. embedding static values instead of passing bind variables; e.g., in a simple scenario, an application validating a username without passing it as a bind variable. This slows down application performance when the same SQL is used heavily, and it also exposes the passed values in the SQL text. You can refer to the using bind variables blog for a detailed explanation, and see the sketch after this list.
  • Forecast the growth of your key tables and use the right indexes.
  • Avoid using big sub-queries inside an IN clause: when fetching large datasets, the database may execute the sub-query for each fetch. To avoid this, use an EXISTS clause instead (see the sketch after this list).
  • For large tables, avoid LIKE predicates with a % wildcard on both ends. If that pattern is part of the application design, suggest creating a function-based index on the column used in the LIKE clause.
  • Always give preference to direct joins, and be wary of redundant information duplicated across tables.
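
A minimal sketch of the bind-variable and EXISTS points above (LOGIN_USER and EMP appear elsewhere in this post; DEPT and its columns are illustrative):

-- bind variable instead of a concatenated literal: one shared cursor, value not exposed
SELECT COUNT(1) FROM LOGIN_USER WHERE USER_ID = :username;

-- sub-query inside IN ...
SELECT E.EMPLOYEE_NAME FROM EMP E
WHERE E.DEPT_ID IN (SELECT D.DEPT_ID FROM DEPT D WHERE D.REGION = 'EMEA');

-- ... rewritten with EXISTS, as suggested above
SELECT E.EMPLOYEE_NAME FROM EMP E
WHERE EXISTS (SELECT 1 FROM DEPT D WHERE D.DEPT_ID = E.DEPT_ID AND D.REGION = 'EMEA');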

These are some of the basics we need to be aware of when writing queries. For some people this may seem simple, but during my IT career I have many times seen these basic principles go unused. I will be writing another post on the usage of commits and GROUP BY clauses.

Avoiding SQL Injection attacks in Coding

SQL injection is one of the most commonly seen attacks nowadays. SQL injection means the attacker appends a piece of SQL code to a browser script, or to an input parameter, in order to gain access to the site. Most of the time we fail to put basic validations in the code, leaving loopholes that can easily be targeted. When it comes to validation, we should not allow any special characters that could execute as valid SQL and end up giving access to the attacker. But in some cases it is mandatory to allow special characters in an input field, such as a password field. In that scenario we must ensure the complete input value is treated as text, as explained below.

Here is an example of a login form where the validations are not very good. The application takes the username/password fields, checks whether the combination exists at the DB level, and then allows access to the user. Consider the following SQL used to validate the login form, where both the user and password data are stored in the LOGIN_USER table.

sqlQuery = "SELECT count(1) FROM LOGIN_USER" +
" WHERE USER_ID = '" + inputUserNameString + "' AND PASSWORD = '" + inputPasswordString + "'";

If I enter username = test and password = test, assuming that combination is not present at the DB level, the login form will not grant me access to the main form. The query executes as:

SELECT count(1) FROM LOGIN_USER WHERE USER_ID = 'test' AND PASSWORD = 'test';
-- no matching row: count is 0 and access is denied

But what if I enter the following string in the password field: test' or '1' = '1 ? Since we are not treating the password string entirely as text, the single quotes and the equals sign become part of the SQL query. From the SQL point of view this appends an OR condition that is always true, so the query returns the count of all rows in the table, as below.

SELECT count(1) FROM LOGIN_USER WHERE USER_ID = 'test' AND PASSWORD = 'test' or '1' = '1';
-- '1' = '1' is always true, so the count of all rows is returned

As a result, the application grants access to the main form without a valid user and password combination. This type of injection is most commonly used to bypass validation forms. To avoid such bugs we must treat the entire input string as text. In the case above, I now replace every single quote in the input with two single quotes, which the database then treats as a literal quote character rather than as part of the SQL query. The input test' or '1' = '1 therefore becomes test'' or ''1'' = ''1 as below, and when executed at the DB level it returns no rows.

SELECT count(1) FROM LOGIN_USER WHERE USER_ID = 'test' AND PASSWORD = 'test'' or ''1'' = ''1';
-- the whole input is now a single literal string, so no row matches

As seen above, the value passed in the password field is now validated exactly at the database level, avoiding SQL injection.

The example provided above is the most basic illustration of SQL injection. Hackers do use far more advanced injections, but if we ensure that all critical input parameters are validated without leaving loopholes, this type of attack can be largely avoided.
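
As an aside, quote-doubling is only one defence; the more robust route, tying back to the bind variables tip earlier, is to never let input become part of the SQL text at all:

-- with bind variables the input is data, never SQL, so no escaping is needed
SELECT count(1) FROM LOGIN_USER WHERE USER_ID = :username AND PASSWORD = :password;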

Oracle – Checking LOB Segment Usage

In Oracle, a LOB datatype is stored as a separate segment. If your table contains a LOB column, its space usage will not be shown under the table's own usage data. To find the space consumed by LOB segments under a given schema, the following query can be used.

SELECT DL.OWNER, DL.TABLE_NAME, DL.COLUMN_NAME, SUM(DS.BYTES)/1024/1024/1024 AS SIZE_GB
FROM DBA_LOBS DL, DBA_SEGMENTS DS
WHERE DS.SEGMENT_NAME = DL.SEGMENT_NAME
AND DS.OWNER = DL.OWNER
AND DS.TABLESPACE_NAME = 'USERS' AND DS.SEGMENT_TYPE = 'LOBSEGMENT'
AND DS.OWNER LIKE 'SCHEMA_NAME%'
GROUP BY DL.OWNER, DL.TABLE_NAME, DL.COLUMN_NAME
ORDER BY 4 DESC;

If the above query does not show the expected data, find the correct tablespace name and substitute it in the query.
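
A quick way to locate the right tablespace, using the same dictionary views:

SELECT DISTINCT TABLESPACE_NAME FROM DBA_SEGMENTS WHERE SEGMENT_TYPE = 'LOBSEGMENT';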

Oracle – Viewing Foreign Key Mappings

The following query can be used to view all primary-to-foreign key mappings under a given user's schema. Supply the owner name (typically in upper case) through the :Schema_Name bind variable.

SELECT CON_TAB_M.TABLE_NAME FK_TABLE, CON_COL_M.COLUMN_NAME FK_COLUMN, CON_TAB_R.TABLE_NAME PK_TABLE,
CON_COL_R.COLUMN_NAME PK_COLUMN, CON_TAB_M.CONSTRAINT_NAME FK_CONSTRAINT, CON_TAB_R.CONSTRAINT_NAME PK_CONSTRAINT
FROM DBA_CONSTRAINTS CON_TAB_M, DBA_CONSTRAINTS CON_TAB_R, DBA_CONS_COLUMNS CON_COL_M, DBA_CONS_COLUMNS CON_COL_R
WHERE CON_TAB_M.OWNER = :Schema_Name AND CON_TAB_M.CONSTRAINT_TYPE = 'R'
AND CON_TAB_M.OWNER = CON_COL_M.OWNER AND CON_TAB_M.TABLE_NAME = CON_COL_M.TABLE_NAME
AND CON_TAB_M.CONSTRAINT_NAME = CON_COL_M.CONSTRAINT_NAME
AND CON_TAB_M.R_OWNER = CON_TAB_R.OWNER AND CON_TAB_M.R_CONSTRAINT_NAME = CON_TAB_R.CONSTRAINT_NAME
AND CON_TAB_R.OWNER = CON_COL_R.OWNER AND CON_TAB_R.TABLE_NAME = CON_COL_R.TABLE_NAME
AND CON_TAB_R.CONSTRAINT_NAME = CON_COL_R.CONSTRAINT_NAME
ORDER BY 1;