Friday, 20 February 2015

Parallel Processing: the Damien Hirst of SQL Tuning Art World

After spending a decade of my life among people whom I eventually came to regard generally as imposters (i.e. journalists), I learned something important about myself: I'm a purist committed to "principles" and I hate bullshitters!

But even if you ignore the pointless self-glorification of the first paragraph, you must have understood from the title that I'm not a fan of Damien Hirst!

Taking the "SQL Tuning Expert" certification exam (1Z0-117) recently reminded me that perhaps Damien Hirst is not the only entity that -in my opinion- is placed in a context where it doesn't belong; or AT LEAST maybe we shouldn't regard people like him as such "integral" parts of the art world.

The following is a review of the subjects covered in the exam, and of what was overemphasised.

Here It Goes!

Disclaimer: If you've found your way to this post hoping that you might find some exam dumps or samples, you can press "Back" on your browser RIGHT NOW!

My only aim is to share what I remember (I will update if/when I remember more) about what needs to be learned for an applicant to pass the exam and achieve the "SQL Tuning Expert" certification, mainly because Oracle did not provide even a sample test.

I divide this post into three parts:

  1. Main Subjects
  2. References and Resources
  3. A Demand for Keeping Things In Perspective

1. Main Subjects 

You can find the exam topics here, but I did not find that list representative of the exam's main emphasis; so here are the actual main topics (not all of them, and not in the exam's order of emphasis!):
  1. Reading Execution Plans
  2. Query Transformation
  3. Partition Pruning
  4. Parallel Processing
  5. DBMS_MONITOR/trcsess/tkprof 
  6. SQLTuningSets, SQL Tuning Advisor, and SQL Access Advisor
  7. SQL Profiles and SQL Plan Management

1.1. Reading Execution Plans


The plans you come across in the exam are not complicated (in particular, stats-related issues such as cardinality mismatches and manifestations of stale stats are generally obvious), and you will have seen them all if you have gone through the Oracle documentation mentioned below under "References and Resources."

  • You should be loving this stuff! If you don't like cracking the execution plans code, why on earth are you taking this exam anyway?!
  • You should be familiar with different interpretation methods of execution plans in terms of finding out the chronology of the events and processes involved.
  • More or less every SQL Tuning case that is mentioned in the official exam topic is referenced in the exam through an execution plan.
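Most of the plans in the exam are of the kind you can generate and read yourself; a minimal warm-up sketch (the table names are illustrative):

```sql
EXPLAIN PLAN FOR
SELECT e.ename, d.dname
FROM   emp e JOIN dept d ON d.deptno = e.deptno;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- For a cursor that has actually run (needed when comparing estimated
-- vs. actual row counts), use DISPLAY_CURSOR on the last execution:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```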


1.2. Query Transformation


  • When does each of View Merging, Subquery Unnesting, and Query Rewrite happen?
  • When can each of them not happen?
  • What are the signs of each taking place in an execution plan?
  • When would you see a "Projection View" during View Merging?
  • How do you distinguish a Bitmap Join from a Star Transformation?
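A quick way to revise these is to toggle the transformations with hints and diff the resulting plans; a sketch (table names are illustrative):

```sql
-- Subquery unnesting: with UNNEST the subquery shows up as a join row
-- source (e.g. HASH JOIN SEMI); with NO_UNNEST it stays a FILTER operation.
SELECT /*+ UNNEST(@subq) */ *
FROM   emp e
WHERE  EXISTS (SELECT /*+ QB_NAME(subq) */ NULL
               FROM   dept d
               WHERE  d.deptno = e.deptno);

-- View merging: NO_MERGE keeps a separate VIEW row source in the plan;
-- without it the inline view is usually merged into the outer query block.
SELECT /*+ NO_MERGE(v) */ v.deptno, v.cnt
FROM   (SELECT deptno, COUNT(*) cnt FROM emp GROUP BY deptno) v;
```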


1.3. Partition Pruning


  • What is the sign of Partition Pruning NOT taking place on a partitioned table in an execution plan?
  • Generally, how can predicates affect partition pruning under different partitioning methods? For example, what can cause a Full Table Scan of a partitioned table? What kind of predicates will facilitate partition pruning in Range, List, Hash, or System partitioned tables?
  • Full/Partial partition-wise joins: when will they happen in parallel?
  • Keep an eye on the "IN-OUT" column of execution plans for parallel partition-wise joins and "S -> P" operations: not all the tables are read in parallel!
  • What is System Partitioning?
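The sign to look for is the Pstart/Pstop pair of columns in the plan; a minimal sketch (table and partition names are illustrative):

```sql
CREATE TABLE sales (sale_date DATE, amount NUMBER)
PARTITION BY RANGE (sale_date)
( PARTITION p2014 VALUES LESS THAN (DATE '2015-01-01'),
  PARTITION p2015 VALUES LESS THAN (DATE '2016-01-01'));

-- A predicate directly on the partition key prunes: Pstart/Pstop show a
-- single partition number. Wrapping the key in a function such as
-- TO_CHAR(sale_date) disables pruning and Pstart/Pstop cover all partitions.
EXPLAIN PLAN FOR
SELECT SUM(amount) FROM sales WHERE sale_date >= DATE '2015-06-01';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```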


1.4. Parallel Processing


You need to know about the following concepts:

  • How do the different values of PARALLEL_DEGREE_POLICY (AUTO | MANUAL | LIMITED) affect the behaviour of the database?
  • How is the DOP for a SQL statement calculated under each of the modes mentioned above?
  • Under which circumstances is Auto DOP used?
  • In each mode of PARALLEL_DEGREE_POLICY, when is a statement run in parallel; i.e. using which syntax (hint) or which parallel settings on the segments involved?
  • Under which conditions are parallel SQL statements queued?
  • How/when does the value of PARALLEL_MIN_TIME_THRESHOLD affect the execution of a SQL statement?
  • Under which circumstances can CTAS and INSERT ... SELECT run in parallel? And when does the SELECT part of either operation [not] run in parallel?
  • What could cause an ORA-7454 or an ORA-12827 during parallel execution?
  • This is a tricky one: when does a direct-path insert into an IOT NOT take place in parallel, even if the IOT was created in parallel mode?
  • How can Resource Management be deployed to provide services with different parallel processing characteristics (e.g. statement queuing)?
  • Under which mode of PARALLEL_DEGREE_POLICY, and how, does lack of I/O Calibration affect parallel processing?
  • When should the PARALLEL(MANUAL) hint be used?
  • When can In-Memory parallel execution be employed? Where is the data kept in such an environment? (PGA vs. Buffer Cache)
  • Under which circumstances can DML statements not be executed in parallel?
  • How does parallel execution take place in RAC?
  • What's the effect of each of the relevant PARALLEL_* initialization parameters?
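Most of these can be revised on any sandbox database; a minimal sketch of the moving parts (object names are illustrative):

```sql
-- Auto DOP and statement queuing:
ALTER SESSION SET parallel_degree_policy = AUTO;

-- Object-level DOP vs. statement-level hint:
ALTER TABLE sales PARALLEL 4;
SELECT /*+ PARALLEL(8) */ COUNT(*) FROM sales;

-- Parallel DML has to be enabled explicitly, per session:
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(sales, 4) */ INTO sales
SELECT * FROM sales_stage;

-- Check whether parallel servers were actually used:
SELECT px_servers_executions FROM v$sqlstats WHERE sql_id = '&sql_id';
```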


1.5. DBMS_MONITOR/trcsess/tkprof


  • Which kinds of trace (i.e. which DBMS_MONITOR procedures) could produce multiple trace files?
  • You need to be familiar with the sequence of actions from generating the trace, to using trcsess, and eventually tkprof.
  • Which program consolidates trace data from different trace files: trcsess or tkprof?
  • How does tkprof treat recursive calls?
  • tkprof parameters; for example, which options are available for sorting the output items?
  • How many trace files are produced by each kind of tracing?
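The typical sequence, sketched end to end (service and module names are illustrative):

```sql
-- 1. Enable tracing for a service/module combination; this may produce
--    many trace files, one per server process touched by the workload:
EXEC DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE('MYSERVICE', 'MYMODULE');

-- ... workload runs ...

EXEC DBMS_MONITOR.SERV_MOD_ACT_TRACE_DISABLE('MYSERVICE', 'MYMODULE');

-- 2. At the OS prompt, consolidate the scattered trace files with trcsess,
--    then format the single consolidated file with tkprof:
--      trcsess output=myservice.trc service=MYSERVICE *.trc
--      tkprof myservice.trc myservice.txt sort=exeela sys=no
```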


1.6. SQL Tuning Sets, SQL Tuning Advisor, and SQL Access Advisor


  • What can be kept in an STS? The syntax of the different STS management operations matters less than a full grasp of the STS management process.
  • What are the data sources for automatic and manual invocations of SQL Tuning Advisor?
  • What are the sources for SQL Access Advisor?
  • What is the functional relationship between STS, Tuning Advisor, and Access Advisor? Which one is used by which?
  • What are the outputs of Tuning Advisor and Access Advisor? For example, which one issues recommendations about deploying materialized views?
  • What is the role of the Automatic Tuning Optimizer?
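A sketch of the STS plumbing (the set name and filter are illustrative):

```sql
-- Create an STS and load it from the cursor cache:
EXEC DBMS_SQLTUNE.CREATE_SQLSET(sqlset_name => 'MY_STS');

DECLARE
  cur DBMS_SQLTUNE.SQLSET_CURSOR;
BEGIN
  OPEN cur FOR
    SELECT VALUE(p)
    FROM   TABLE(DBMS_SQLTUNE.SELECT_CURSOR_CACHE(
                   'parsing_schema_name = ''HR''')) p;
  DBMS_SQLTUNE.LOAD_SQLSET(sqlset_name     => 'MY_STS',
                           populate_cursor => cur);
END;
/
-- The same STS (text, plans, binds, and execution statistics) can then
-- be fed to both SQL Tuning Advisor and SQL Access Advisor.
```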


1.7. SQL Profiles and SQL Plan Management


  • Which initialization parameters facilitate the discovery, storage, and use of plans by the optimizer?
  • What is the role of the Automatic Tuning Optimizer in an ecosystem that also includes SQL Tuning Advisor, SQL Profiles, and the SPM mechanism? How are the dots connected? You need to know the data flow between these elements.
  • What are the different steps in the life cycle of a plan? How does a "Fixed" plan differ from an "Accepted" one?
  • What happens when a plan changes? What does "Evolution" entail?
  • How do you manage different plans for different purposes? e.g. batch processing vs. other times.
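A sketch of the SPM side of that life cycle (the sql_id and sql_handle are illustrative):

```sql
-- Capture: either automatic ...
ALTER SYSTEM SET optimizer_capture_sql_plan_baselines = TRUE;

-- ... or manual, e.g. from the cursor cache:
DECLARE
  n PLS_INTEGER;
BEGIN
  n := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(sql_id => 'abcd1234efgh5');
END;
/

-- Use is controlled by optimizer_use_sql_plan_baselines (TRUE by default).
-- Evolution verifies whether a non-accepted plan performs better:
DECLARE
  rep CLOB;
BEGIN
  rep := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE(sql_handle => '&sql_handle');
END;
/
```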


2. References and Resources


I suggest the following resources as adequate for passing the certification exam:

  • Database SQL Tuning Guide, here
  • Database VLDB and Partitioning Guide, here
  • Troubleshooting Oracle Performance, 2nd Ed (2014), by Christian Antognini via Apress

3. A Demand for Keeping Things In Perspective


Although the subject of Parallel Processing appears in less than 20% of the subjects outlined in the official exam topics, it was actually referenced in about half of the 75 questions, in one way or another! So if you're not proficient in the concepts of parallel processing, there's a big chance that you will not pass this exam.

Implications of such huge emphasis on the subject are somewhat confusing; especially considering the fact that "throwing resources at a problem" is not exactly an example of "Tuning."

Parallel Processing is as much of a "SQL Tuning" technique as Damien Hirst is a representative of Art, compared to artists such as Picasso, Dali, or even Antony Gormley.

If -like me- you have a background which makes you prone to being open to conspiracy theories, you could even call it "a sign of Oracle's hidden agenda to drive the demand for more beefy ('Engineered') servers!"

But less naive exam takers could simply point to the few references to the concept of Resource Management in the exam, and call their number "disproportionate" to the number of questions about Parallel Processing.

I believe the subject of Parallel Processing could be addressed in an exam such as Performance Tuning (1Z0-054 or 1Z0-064), AND alongside concepts like instance caging.

At least I tend to think that "SQL Tuning" is more of an art form, and Parallel Processing is the sledgehammer that Rodin never used, and never would have used, to create "The Thinker."

Update: I'd like to thank Horia Berca for guiding me to slides from Tom Kyte's Optimizer Masterclass. The Optimizer Masterclass is not the preparation/training course for 1Z0-117, but it made me happy to see how the emphasis that one of Oracle's gurus has put on the subject of Parallel Processing seems to bolster my position on the subject, and what I've expressed in this post.

Wednesday, 19 March 2014

Observations on an Oracle bug: You can't CTAS an IOT with Timestamp(n)!

If you are planning on loading data into a partitioned IOT with a "Timestamp(n)" field using CTAS, well, you can't!

I came across the issue a while ago, and had it confirmed by Oracle as a bug, which manifests itself even in 12cR1.

Here is the scenario:

You need to load an IOT (named EMP) by adding new partitions and exchanging them with another IOT produced using a CTAS.

This is basically what you would need to do (a minimal sketch; names and partition bounds are illustrative):

CREATE TABLE emp (
  DeptID   NUMBER,
  EmpID    NUMBER,
  HireDate TIMESTAMP(3),
  CONSTRAINT emp_pk PRIMARY KEY (DeptID, EmpID, HireDate))
ORGANIZATION INDEX
PARTITION BY RANGE (HireDate)
(PARTITION p1 VALUES LESS THAN (MAXVALUE));

CREATE TABLE temp (
  DeptID, EmpID, HireDate,
  CONSTRAINT temp_pk PRIMARY KEY (DeptID, EmpID, HireDate))
ORGANIZATION INDEX
AS
SELECT 1, 1, CAST(systimestamp AS TIMESTAMP(3))
FROM dual;

ALTER TABLE emp EXCHANGE PARTITION p1 WITH TABLE temp;


And that's when you're hit in the face by a:
SQL Error: ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
14097. 00000 -  "column type or size mismatch in ALTER TABLE EXCHANGE PARTITION"
*Cause:    The corresponding columns in the tables specified in the
           ALTER TABLE EXCHANGE PARTITION are of different type or size
*Action:   Ensure that the two tables have the same number of columns
           with the same type and size.
Obviously you will check:
desc emp
Name     Null     Type
-------- -------- ------------
DEPTID   NOT NULL NUMBER
EMPID    NOT NULL NUMBER
HIREDATE NOT NULL TIMESTAMP(3)

desc temp
Name     Null     Type
-------- -------- ------------
DEPTID   NOT NULL NUMBER
EMPID    NOT NULL NUMBER
HIREDATE NOT NULL TIMESTAMP(3)

And both look identical.
To prove that it is actually possible to exchange a partition with the IOT, create TEMP1 with an explicit TIMESTAMP(3) column and load it with a plain INSERT instead of a CTAS:

CREATE TABLE temp1 (
  DeptID   NUMBER,
  EmpID    NUMBER,
  HireDate TIMESTAMP(3),
  CONSTRAINT temp1_pk PRIMARY KEY (DeptID, EmpID, HireDate))
ORGANIZATION INDEX;

INSERT INTO temp1
SELECT 1, 1, CAST(systimestamp AS TIMESTAMP(3)) FROM dual;

ALTER TABLE emp EXCHANGE PARTITION p1 WITH TABLE temp1;

And you're politely made aware of the fact that:
"table EMP altered."

What's going on here?

Well, the following simple query would shine enough light on the issue:

SELECT table_name, column_name, column_id, data_type, data_length
FROM   user_tab_cols
WHERE  table_name IN ('EMP', 'TEMP', 'TEMP1')
ORDER  BY column_id, table_name;

TABLE_NAME COLUMN_NAME  COLUMN_ID DATA_TYPE    DATA_LENGTH
---------- ----------- ---------- ------------ -----------
EMP        DEPTID               1 NUMBER                22
TEMP       DEPTID               1 NUMBER                22
TEMP1      DEPTID               1 NUMBER                22
EMP        EMPID                2 NUMBER                22
TEMP       EMPID                2 NUMBER                22
TEMP1      EMPID                2 NUMBER                22
EMP        HIREDATE             3 TIMESTAMP(3)          11
TEMP       HIREDATE             3 TIMESTAMP(3)          20
TEMP1      HIREDATE             3 TIMESTAMP(3)          11 

As you can see, CAST(column AS TIMESTAMP(n)) merely presents itself as a TIMESTAMP(n); in fact, it's no different from a plain TIMESTAMP!


The work-around

Modify the source of your CTAS operation to actually hold a TIMESTAMP(n) value, and get rid of the CAST function (the function works fine for other data types).
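In other words, something along these lines (a sketch; the source table and values are illustrative):

```sql
-- Give the source a real TIMESTAMP(3) column first ...
CREATE TABLE src (deptid NUMBER, empid NUMBER, hiredate TIMESTAMP(3));
INSERT INTO src VALUES (1, 1, systimestamp);

-- ... then CTAS without CAST: the dictionary now records the correct
-- column length, and the partition exchange goes through.
CREATE TABLE temp (
  deptid, empid, hiredate,
  CONSTRAINT temp_pk PRIMARY KEY (deptid, empid, hiredate))
ORGANIZATION INDEX
AS
SELECT deptid, empid, hiredate FROM src;
```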

Sunday, 2 February 2014

The challenging journey to overcome lack of knowledge

It should be called a crime when people liken the process of "gaining knowledge" to simply "solving a puzzle."

At least we all have to agree that it's an extremely patronising way of describing the process; mainly because the person in the position of power (in this case "knowledge"), the same person who's describing the process of "gaining knowledge," fails to address the characteristics of that "puzzle" honestly or correctly.

Knowledge has no shape

This is for people like me, who aspire to be good at what they do; those who dream of being true professionals -whether or not they put in enough effort- and who are very well aware that "getting better" is directly associated with gaining more knowledge.

This specifically holds for people who aspire to carry the title of "Oracle Scientist."

The fact of the matter is: knowledge is a puzzle; but with no shape or specific boundaries.

You keep discovering new pieces but you have no idea where -in the broad context of this shapeless puzzle- the pieces sit.

And in the case of Oracle, most of the time the pieces can sit right in the middle of the puzzle!

Embrace the pilgrimage!

I'd have to apologise for committing "the crime" too; because the process of gaining knowledge is no journey.

It's a pilgrimage.

You would know it's a pilgrimage when you overcome the initial fear that takes over you, when you realise that the "puzzle piece" you just found can ONLY fit in the middle, not directly connected to any other pieces you've previously found, but can be potentially connected to either of them indirectly!

Easy answers are rare, the path is long, and above all -as Connor McDonald rightfully quotes- "Learning is not a spectator sport!"

Find a guiding light!

Nobody has done it on his own, EVER.

I've been reminded of this, time and time again during the course of my life, and most lately by Kirby Ferguson.

You need to be surrounded by knowledgeable people; the kind of people who don't give you the answer, but are patient enough with you that you learn the path to acquiring that answer.

There's no shortage of such people in the Oracle community, if you've got the will to find them.

Wednesday, 25 December 2013

IMPDP/EXPDP, Virtual Columns, and user-defined functions

If you have virtual columns based on user-defined functions in a schema, you cannot use IMPDP/EXPDP for that schema; not that easily, at least.

In this post I lay out a scheme for a test, and propose a successfully tested work-around.

Required material for the scenario:

- A schema named VCOL_TEST, containing:
- A function named VCOL_FUNCTION
- A table named VCOL_TABLE

The script below can be used to create such schema:

create user vcol_test identified by vcol_test;

grant   create session,
        select any dictionary,
        create procedure,
        create table,
        debug connect session,
        debug any procedure
to vcol_test;

conn vcol_test/vcol_test;

-- the function must be DETERMINISTIC to be usable in a virtual column
create or replace function vcol_function(p_parameter number)
return number deterministic
is
begin
  if p_parameter > 10 then
    return 10;
  else
    return 1;
  end if;
end vcol_function;
/

show errors;

create table vcol_table(id number, vcol as (vcol_function(id)));


What's the problem?

If you try to use IMPDP to create a "remapped" schema (e.g. VCOL_TEST2) on a different server, you would come across this:

Processing object type SCHEMA_EXPORT/TABLE/TABLE
ORA-39083: Object type TABLE:"VCOL_TEST2"."VCOL_TABLE" failed to create with error:
ORA-00904: "VCOL_TEST"."VCOL_FUNCTION": invalid identifier
Failing sql is:

Job "SYSTEM"."SYS_IMPORT_SCHEMA_01" completed with 1 error(s)
What's happening here?

Checking the "Failing sql" would manifest the bug: IMPDP is trying to create table VCOL_TEST2.VCOL_TABLE referring to function VCOL_TEST.VCOL_FUNCTION (as opposed to VCOL_TEST2.VCOL_FUNCTION).

i.e. IMPDP is not able to remap the owner of the function used for the virtual column, to the new schema name.
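For reference, the run that triggers this was along these lines (the directory and dump file names are illustrative):

```
expdp system schemas=VCOL_TEST directory=DP_DIR dumpfile=vcol_test.dmp

impdp system remap_schema=VCOL_TEST:VCOL_TEST2 directory=DP_DIR \
      dumpfile=vcol_test.dmp
```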

What does Oracle say about this?

Officially, nothing.

Although a consulting member of technical staff at Oracle was kind enough to send the following response:

"In releases prior to, a table with a virtual column based on the result of a PL/SQL function would not be recreated successfully on Import, even when REMAP_SCHEMA was not used. This is because the necessary PL/SQL function would not have been created before the table was created.

The problem was addressed by modifying Import to create the table with a temporary definition for virtual columns, then creating the function, then modifying the definition of the virtual columns to use the correct function.

The initial fix was in BUG#10186633 and this was later extended in BUG#14506804. The extended fix is also included in certain patch set updates and one-off patches for and

Unfortunately this fix does not address the problem of REMAP_SCHEMA. Even in or, it is still possible to encounter a failure during Import, or a failure during runtime, depending on the privileges of the owner of the target schema and the existence of the source schema."

The work-around

Since re-importing and remapping existing test schemas is part of our daily tasks, I have developed a bash script which follows these steps:

In Export:
1. Create a list of user defined functions which are used in virtual columns, and according to their dependencies, use DBMS_METADATA to create a SQL script which can recreate the functions with no reference to the schema name.

2. Create a list of tables whose virtual columns rely on the exported functions, and (using DBMS_METADATA, again) build a schema-independent SQL script to recreate them.

3. Obviously, you would also need to build a script to grant quotas on the tablespaces that will hold the tables concerned, to the new user/schema.
4. Export as usual, using EXPDP.

In Import:
1. Create the new schema, grant all the privileges and quotas according to the source schema.
2. Use the script created in step 1, above, to create the functions.
3. Use the script created in step 2, above, to create the tables concerned.
4. IMPDP everything EXCLUDING the tables mentioned in the previous step. Ignore the "already exists" messages during the process.
5. IMPDP DATA_ONLY for the tables mentioned in the previous step.

IMPDP is done in two passes because the existence of the already-created tables (from step 3 of the Import) would otherwise lead to "already exists" errors, which would prevent importing the data for those tables.

Note: The method above only works if -like us- the tables concerned do not hold any referential constraints to any other tables (our "solution" is a data warehouse and the tables concerned are "fact" tables whose integrity constraints -if any- are enforced through the application).

If they did, you might have to make sure that their "child" tables -if any- are imported after them.
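The schema-independent DDL in steps 1 and 2 of the Export can be generated by turning off schema emission in DBMS_METADATA; a sketch:

```sql
BEGIN
  -- Strip the owner and the storage/tablespace clauses so that the
  -- generated DDL can be replayed in any target schema:
  DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM,
                                    'EMIT_SCHEMA', FALSE);
  DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM,
                                    'SEGMENT_ATTRIBUTES', FALSE);
END;
/

SELECT DBMS_METADATA.GET_DDL('FUNCTION', 'VCOL_FUNCTION', 'VCOL_TEST') FROM dual;
SELECT DBMS_METADATA.GET_DDL('TABLE',    'VCOL_TABLE',    'VCOL_TEST') FROM dual;
```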

Final Note

The same problem could occur for function-based indexes as well, and it does not seem to be an issue that Oracle DBAs can overlook.

Virtual Columns are extremely helpful, but as Matthias Rogel explains in an example, their implementation has not been without issues.

Fortunately this time, according to an Oracle insider, the Datapump team is aware of the bug.

There's still hope!

Sunday, 8 December 2013

Real World Performance; A course in its infancy

Clarity is the most vital element of the "learning" process.

If both parties involved in the process (i.e. teachers & students) do not have a clear idea about the subjects of the study, the ideal target audience, and the presentation methods, the ultimate goal -the effective knowledge transfer, that is- would not be fully realized.

At this moment (December 2013) the Real World Performance training sessions offered by Oracle do not seem to be clear about some of these factors.

Belated disclaimer: this is a long post! But if you are an anti-corporation (specifically anti-Oracle) kind of person, you will find some interesting material along the way!

The Target Audience

The 3.5-day course was designed to cover the three general areas of SQL, OLTP, and Data Warehousing, so that the audience could learn to "apply the correct performance techniques to get the best Oracle Database and Exadata performance."

Although I could not find a member of the audience who assumed Exadata would not be central to this course, not even a five-minute compact Exadata architecture presentation was offered; and that's despite the fact that the level of exposure to Exadata was one of the earliest questions asked of the audience, to which some -including yours truly- responded: "not much."

Two days after the course, I'm still struggling to work out whether the course designers knew whom they wanted as an audience, and what technical abilities/qualities their ideal audience should have.

Subjects of Study

The SQL part started with clarifying the different elements in the structure of a SELECT statement, followed by joins and analytical/window functions, and then went straight into an examination of a parallel query's execution plan, while completely skipping an explanation of how to interpret an Execution Plan.

(How many people are not clear on what "Row Source" means, yet have a clear knowledge of how to interpret an Execution Plan?!)

A presentation titled "Why my SQL is slow" followed, with emphasis on the importance of the CBO having access to correct cardinality estimates, choosing the correct degree of parallelism, when to use indexes and histograms, the effect of using Flash Temp, choosing the right size for the PGA and the correct parallel distribution method (see Parallel Execution), seeding column usage, and using extended stats!

All in about an hour (!) with not so clear presentation slides (see Demo Tools). The following presentation on "SQL Tuning tool" (see A Presenter's Cardinal Sin, below) was definitely not the highlight of the course!


OLTP

Graham Wood's presentation on OLTP had a major point of emphasis, based on the graph presented below:

Comparing min/max/avg number of transactions against number of processes per cpu
(Copyright Oracle, 2012)

The test case clearly showed that increasing the number of parallel threads to a DB leads to a reduction in throughput, not an increase => control the number of connections from the middle tier to the database.

(This often-overlooked point was made in a presentation by Andrew Holdsworth -known as "The Boss" by the Real World Performance team members!)

Graham Wood warned against using dynamic connection pools, and referred to the following as possible culprits in poor performance scenarios:
  • Excessive logon/logout
  • Invalid SQL (leads to "SQL*Net Break/Reset")
  • Cursor leaks (if you'd need to increase "open_cursors" something is wrong)
  • Session leaks (exceptions happen, but the middle tier doesn't close the session => high # of idle sessions, free memory is reduced, an idle/leaked session might hold a lock ...)
  • Lock leaking (side effect of session leak, no rollback/commit in exception handler)
Data Warehousing

Although the Real World Performance team admitted that this was the first time the subject was included in the course, Mike Hallas made a great presentation out of it.

He was knowledgeable and experienced, so much so that you wished he had an alternative outlet (blog, twitter, etc.) to share his experience outside the framework of OOW or Oracle's training courses.

He compared row-by-row, array-based, home-grown parallelism, and set-based processing during ETL/ELT operations.

Based on the available slides, though, it appears that the course simply ran out of time to talk specifically about different indexing strategies for OLTP and on Exadata, as well as about Partitioning pitfalls.

Parallel Execution

This was the ever-present concept throughout the course, and rightly so, considering the architecture of Exadata and its capabilities, and the globally dominant thirst to achieve 100% efficiency from any Oracle box.

A unique real-world presentation was made with members of the audience collaborating, to show the difference between the Hash and Broadcast distribution methods used between parallel producers and consumers (while Bjorn Engsig acted as Query Coordinator!), although it could have been rounded off by tying it back to an actual query.

The clear distinction between the use cases of the two methods, and how to influence either, was one of the main plus points of the course.

But considering the level of emphasis on the subject, perhaps having a compact and objective presentation on the following points could be beneficial:
  • How to enable parallel processing for an object
  • How to ensure read/write in parallel
  • What could disable/block parallel operations
  • Effects of parallel write on compression
  • Clarification on parallel operations and logging

Impressive Demo Tools

The presenters of Oracle's Real World Performance group were using a number of fantastic demo platforms (apparently home grown) to showcase different performance tuning scenarios.

Graham Wood used one of the more effective ones for a presentation on how to balance the workload, throughput and the number of connections to the DB, during a day of talk on OLTP.

A very informative presentation on the factors contributing to "Why is my SQL Slow" used a less effective platform, mainly because the SQL Monitor output had much more detail and was not completely readable/comprehensible at times; the words and characters were simply too small.

Presentation Demo Tool

Mike Hallas who served as the main presenter of the course, used another of the more effective platforms to present a comparison of different operations on a data warehouse.

A Presenter's Cardinal Sin!

The fact that you would never hear a sentence like "here's the boring bit!" in an OakTable presentation proves that you don't need to have spent a decade in journalism -as I've done- to know that you NEVER EVER should under-sell your material!

And that's the sentence we heard when the talk on "SQL Tuning tools" began.

Such a sentence is an indication of lack of enthusiasm, and if you do not feel enthusiastic enough about a subject that you're presenting, then why are you presenting it?!

That became a double blow for members of the audience like me, who were expecting and hoping that the course would take a step to enable a DBA's hands as much as his/her mind, and that a sufficient amount of time would be spent trying our hands on different tuning tools, under guidance from the Oracle guys.

Perhaps the false expectations were due to lack of clarity from Oracle regarding the concepts that were about to be covered in the hands-on sessions and the way they were to be covered.

But the "SQL Tuning tools" remained "the boring bit" till the end.

Observations of an Oracle newbie

All in all, the course seemed to cover too many subjects in too little time, at least for hands-on experiments.

Also, the clear distinction between the general behaviour of an Exadata system and a normal Oracle box was made in an ad-hoc way rather than a systematic one; there were no plans to present a "These operations (and only these) would be different on Exadata" talk.

There was no mention of Exadata pitfalls, or of "What Exadata isn't good for." (Am I too naive to expect such a discussion in an official Oracle training? By the way, if you're interested in the subject, you can watch this OakTable presentation by Jonathan Lewis.)

And the final observation (anti-Oracle corporation readers would love this):

EVERYBODY quotes "Tom's" quotes, expressions ("slow-by-slow"), jokes, stories, ... and only "Tom's."

Or maybe I'm wrong.

Thursday, 7 November 2013

DBA_SEGMENTS Vs. DBA_EXTENTS for calculating data size

If you are dealing with a single-datafile tablespace and you need to know the size of a specific chunk of data, you need to use DBA_SEGMENTS!

Here's a simple experiment on tablespaces with 32K blocks, on ASM, and with PARALLEL_DEGREE_POLICY set to MANUAL:

-- Result: 8346468352 IN 162.67 SECs

-- Result: 8346468352 IN 3.687 SECs
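The two statements compared were along these lines (a sketch; the tablespace name is illustrative, and the timings above are from the original run):

```sql
-- Slow: DBA_EXTENTS walks the full extent map.
SELECT SUM(bytes) FROM dba_extents  WHERE tablespace_name = 'DATA_32K';

-- Fast: DBA_SEGMENTS reads segment-level totals.
SELECT SUM(bytes) FROM dba_segments WHERE tablespace_name = 'DATA_32K';
```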
The SQL statements were run exactly in the order displayed, but to make sure that caching of the base segments (like the SYS_OBJECTS view or any other SYS schema object) was not contributing to the achieved times, the shared pool was flushed and the statements were run in reverse order. Same results.

The tablespace used has two datafiles, but same statements were also tried on single file tablespaces and again, the same results.

Obviously, DBA_EXTENTS allows you to query by FILE_ID, which -unfortunately- is crucial when you are interested in the segments/extents on one particular file. In my case, I need it for a "datafile defragmentation" proc, which requires the BLOCK_ID of extents; hence DBA_EXTENTS.

Side note: SUM(BYTES) returns the total size of the blocks that contain data, not the actual size of the data; although I'm not sure whether that matters or not!

Post-publish-newbie-Observation: Just noticed that DBA_SEGMENTS is entirely based on SYS.SYS_DBA_SEGS, whose HEADER_FILE column represents FILE_ID.

Also, the "BYTES" column in DBA_EXTENTS/DBA_SEGMENTS is calculated from SYS_DBA_SEGS using the following (ugly looking, but time-saving!) formula:

decode(bitand(segment_flags, 131072), 131072,
       blocks,
       decode(bitand(segment_flags, 1), 1,
              dbms_space_admin.segment_number_blocks(tablespace_id, relative_fno,
                     header_block, segment_type_id, buffer_pool_id,
                     segment_flags, segment_objd, blocks),
              blocks)) * blocksize

Regarding the EXTENT_ID: what you need is in sys.x$ktfbue.ktfbuebno; that is, if you need a precise BLOCK_ID. But if -like me, using a 5MB margin for resizing datafiles- a more-or-less precise BLOCK_ID does the job, then your answer lies in SYS.SYS_DBA_SEGS.HEADER_BLOCK.

It is either the same as the BLOCK_ID you'd get from sys.x$ktfbue.ktfbuebno and sys.uet$.block# (that is, DBA_EXTENTS.BLOCK_ID), or it's a mere 2 blocks higher!
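The comparison can be sketched as follows (joining on owner and segment name, and looking only at each segment's first extent):

```sql
-- Compare the HEADER_BLOCK reported by SYS_DBA_SEGS with the BLOCK_ID
-- of each segment's first extent in DBA_EXTENTS:
SELECT s.segment_name,
       s.header_block,
       e.block_id,
       s.header_block - e.block_id AS diff   -- observed: 0 or 2
FROM   sys.sys_dba_segs s
JOIN   dba_extents      e
  ON   e.owner        = s.owner
 AND   e.segment_name = s.segment_name
 AND   e.extent_id    = 0
WHERE  s.owner = USER;
```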

The king is dead! All hail the king!

Sunday, 3 November 2013

Oracle and releasing documentation: Do it, or don't!

Something is fundamentally wrong here!

I was watching a great presentation by Carlos Sierra on Adaptive Cursor Sharing, similar to what he presented at OakTable 2013, where I was present (yes, I chose Carlos' talk over Jonathan Lewis' on 12c Histograms! What are you gonna do about it?!), just to remind myself of the critical points made about the subject.

If you have seen the presentation (around 01:05:45+) or have been present at his talk, you'd know that there is a certain point in the presentation where EVERYBODY seems to be confused.

The confusion is about how the ranges of row counts in three different buckets -the ranges that define the buckets- are calculated. Carlos has explicitly admitted to the issue in his article: "Observations on a test case built on seem to indicate that boundaries for these 3 buckets are: less than 1,000; between 1,000 and 1,000,000; and more than 1,000,000."

Carlos is a person who rightly praises OakTable members using a definition that members -rightly, again- attribute to themselves; i.e. Oracle Scientists. But isn't it a bit strange to call people who study the behaviour of a man-made entity "scientists"?! Would you call the greatest mechanics you know (for example) "Ford Scientists"?!

And that brings me to my point: I've been privileged to be among the members of this group since the summer of 2013, and I know that "Scientist" is only one of many titles you could choose to describe them with. But the absurdity of the title arises from the unprofessional behaviour of Oracle itself.

I have absolutely no problem with Oracle keeping its documentation secret; after all, they've spent millions of man-hours to develop a capable piece of software. But you have to make a choice here: either -for whatever reason you choose to do so- you release ALL your documentation (the likes of which you can find here), or you don't release any of it.

The fact that professionals like the OakTable members have to do research to find out about the behaviour of a man-made product (as opposed to a natural phenomenon) whose creator possesses the information about its inner workings can only make an independent observer more inclined to judge the creator as "yet another corporation" or -in the best case- as "unable to provide an association between the size of the company and the quality of its product," at a time when it could be regarded as an actual player in the field of data science.

The ambiguities such as the ones about the ACS could only bolster the position of those who might ask: Is Oracle releasing its documentation -the way it has- merely a PR move?