Big Data Online Practice Test - 10

This test covers Big Data comprehensively, with important questions ranging from the basics to an advanced level.
Q. A lead was explaining the concept of the heartbeat in Hadoop to his team. His explanation was as follows:
1) The DataNode sends a heartbeat to the NameNode, which also carries information about data transfers taking place, total storage capacity, etc.
2) The NameNode waits for around ten minutes before it considers a DataNode to be unavailable.
3) Once a DataNode is considered unavailable, replication of the blocks stored on that DataNode to other working DataNodes starts.
4) The DataNode does not send a block report along with the heartbeat to the NameNode.
Mark the correct option.
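For reference, the "around ten minutes" in statement 2 is derived from two hdfs-site.xml settings. A minimal sketch with the stock defaults (your distribution may override these values); the effective dead-node timeout is 2 x recheck interval + 10 x heartbeat interval, roughly ten and a half minutes:

<!-- hdfs-site.xml (default values shown) -->
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value>              <!-- seconds between DataNode heartbeats -->
</property>
<property>
  <name>dfs.namenode.heartbeat.recheck-interval</name>
  <value>300000</value>         <!-- milliseconds, i.e. 5 minutes -->
</property>
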
Q. In an interview, the following code snippet was given:

object MyClass {
  def main(args: Array[String]): Unit = {
    for (xyz <- 0 until 5)
      println(xyz)
    for (abc <- 0 to 5)
      println(abc)
  }
}
  
Mark the correct values printed for 'xyz' and 'abc'.
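As a refresher on the range semantics being tested, a minimal Scala sketch using the same bounds as the snippet above: 'until' excludes the upper bound, while 'to' includes it.

// 'until' excludes the upper bound; 'to' includes it
(0 until 5).foreach(println)   // prints 0, 1, 2, 3, 4
(0 to 5).foreach(println)      // prints 0, 1, 2, 3, 4, 5
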
Q. A developer was told to display all the documents from the Ports collection where the Shipname values begin with "T".
Mark the correct syntax.
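A minimal mongo-shell sketch of one common way to express such a query, assuming the collection and field names are Ports and Shipname as given in the question:

// anchored regex: matches Shipname values starting with "T"
db.Ports.find({ Shipname: /^T/ })
// equivalent form using $regex
db.Ports.find({ Shipname: { $regex: "^T" } })
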
Q. A developer was asked to describe the features of the following query.
select * from ports CLUSTER BY portId;
Which of the following statements should she include in her answer?
1) It is an alternative to the DISTRIBUTE BY query.
2) The CLUSTER BY columns go to only one reducer.
3) It is an alternative to the SORT BY query.
4) It performs distribution and sorting on the same columns.
Mark the correct option.
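To make the comparison in the statements concrete, a short HiveQL sketch on the same (hypothetical) ports table; CLUSTER BY on a column is commonly described as shorthand for distributing and sorting on that one column:

-- distribute rows to reducers by portId, then sort within each reducer
select * from ports DISTRIBUTE BY portId SORT BY portId;
-- CLUSTER BY combines the two on the same column
select * from ports CLUSTER BY portId;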

Q. A group of leads were discussing the HDFS corrupt-blocks situation. The statements raised were as follows:
Lead A - The HDFS fsck command helps to identify corrupt blocks and provides options on how to fix them.
Lead B - HDFS fsck operates only on metadata; it is an offline process.
Lead C - NameNode recovery is an online process, unlike fsck; it recovers the corrupted blocks.
Lead D - Recovering a corrupted copy of the Edit Log is always better than using another valid copy of it; this helps in saving considerable time.
Mark the correct option.
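For context, a hedged sketch of the fsck invocations usually involved in this situation (the paths are placeholders):

# report file/block health for a path
hdfs fsck /user/data -files -blocks -locations
# list only the files with corrupt blocks
hdfs fsck / -list-corruptfileblocks
# -move quarantines affected files under /lost+found; -delete removes them
hdfs fsck / -move
hdfs fsck / -delete
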
Q. A developer was told to combine the multiple MapReduce output files produced by the reducers.
He was struggling to do so.
Which of the following commands will help him achieve this task?
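A minimal sketch of the kind of command typically used for this, with placeholder HDFS and local paths:

# concatenate the part-r-* files in an HDFS output directory into one local file
hadoop fs -getmerge /user/output /tmp/merged_output.txt
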
Q. A developer was running the following command in the Cassandra shell:
cqlsh:dev> UPDATE ports SET shipment='Yes',shipnumber=523 WHERE portid=3;
Consider the following scenarios.
1) If the row already exists, then the data gets updated.
2) If the row does not exist, then an error is thrown saying the id is unavailable.
3) If the row does not exist, a fresh new row gets inserted.
4) If the row does not exist, then an error is thrown saying the id is unavailable; use UPSERT.
Mark the correct option.
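For reference, UPDATE in CQL is an upsert by default; a small sketch, reusing the table and values from the question, of how an update can be restricted to rows that already exist:

-- only applies the update if a row with portid=3 already exists
-- (a lightweight transaction; returns [applied] = False otherwise)
UPDATE ports SET shipment='Yes', shipnumber=523 WHERE portid=3 IF EXISTS;
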
Q. Map the configuration settings with the files they are found in. The file names given are:
mapred-site.xml (a), core-site.xml (b), hdfs-site.xml (c), hadoop-env.sh (d), yarn-site.xml (e).
The descriptions are:
1) Sets the Java environment variable.
2) DataNode paths on the local file system.
3) Port number used for the Hadoop instance.
4) The interval at which the MRAppMaster sends a heartbeat to the ResourceManager.
Map the descriptions to the files in which they are present.
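As a reference point, a hedged sketch of the stock property names behind these descriptions (all paths, hosts, and values below are placeholders):

# hadoop-env.sh -- the Java environment variable
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk

<!-- hdfs-site.xml -- DataNode paths on the local file system -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/hdfs/datanode</value>
</property>

<!-- core-site.xml -- host and port of the Hadoop instance -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:9000</value>
</property>

<!-- mapred-site.xml -- MRAppMaster-to-ResourceManager heartbeat interval -->
<property>
  <name>yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms</name>
  <value>1000</value>
</property>
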
Q. A developer working in Pig was loading a stations.tsv file as follows:
Y = load 'path/filepath/stations.tsv' AS (stationname:chararray,id:int,ticket:float);
Z= foreach Y generate stationname,ticket;
DUMP Z;
Mark the correct statement.
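For readers trying the snippet locally, a quick way to confirm what Z contains before dumping it is DESCRIBE; a small sketch using the same relation names (PigStorage defaults to a tab delimiter, so the .tsv loads without an explicit delimiter):

-- run in the Grunt shell after the statements above
DESCRIBE Z;   -- prints the projected schema: Z: {stationname: chararray,ticket: float}
DUMP Z;       -- prints one (stationname, ticket) tuple per input record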

Q. A few descriptions about the Hive and Pig frameworks were provided in a test. Students were asked to mark which of them describe Hive and which describe Pig:
1) Loads data quickly and effectively.
2) This component mainly operates on the server side of the cluster.
3) It is easy to write User Defined Functions.
4) It easily supports the Avro file format.
Mark the correct option.
