Questions tagged [cloudera-cdp]
The cloudera-cdp tag has no usage guidance, but it has a tag wiki.
26
questions
3
votes
0
answers
792
views
How do I create a Hive external table on top of ECS S3 object storage using the "s3a://" protocol?
I am trying to create a Hive external table using Beeline on top of S3 object storage using the "s3a://" scheme. I have followed the official Cloudera documentation and configured the below ...
1
vote
1
answer
4k
views
Unable to create Managed Hive Table after Hortonworks (HDP) to Cloudera (CDP) migration
We are testing our Hadoop applications as part of migrating from Hortonworks Data Platform (HDP v3.x) to Cloudera Data Platform (CDP) version 7.1. While testing, we found the issue below while trying to ...
1
vote
1
answer
1k
views
Read/Write with NiFi to Kafka in Cloudera Data Platform CDP public cloud
NiFi and Kafka are now both available in Cloudera Data Platform, CDP public cloud. NiFi is great at talking to everything and Kafka is a mainstream message bus. I just wondered:
What are the minimal ...
1
vote
1
answer
580
views
How to migrate roles from one Apache Ranger instance to another?
We are planning to make a replica of an existing CDP cluster. I can import/export policies but cannot import/export roles.
We have around 2k+ roles. Using the following API I can create a role but ...
1
vote
0
answers
38
views
How can I process FHIR data in Cloudera CDP environment?
I'm developing a NiFi pipeline in CDP to process FHIR data that are stored in an external DB. Is there any specific tool that I can use in Apache NiFi to read and manipulate FHIR data? Or, as ...
1
vote
0
answers
18
views
In CDP how to update OneViewofProfile Id through (VisitorId, browserID)?
Could anyone who is CDP certified guide me on this? What is the better approach to update the CDP profile ID, and how are the use cases below the same? When two users are using the same device, do they get the same ...
1
vote
1
answer
1k
views
Hive managed table issue when creating a Hive table from an HDFS location in CDP
I have a CDP 7.3.1 cluster where, using Sqoop, I have loaded data from a Postgres database table into the HDFS location /ts/gp/node. Now I am trying to create a Hive table on top of it. I get the error below. Please ...
1
vote
1
answer
526
views
Scala - How to read an MQ message that exceeds 4096 characters
Application Information:
IBM MQ 9.2,
Cloudera CDP 7.1.6,
Spark 2.4.5
I am upgrading the Spark code from Spark 1.6 to Spark 2.4.5.
I have JSON content (complex schema) pushed to the MQ queue which the ...
0
votes
1
answer
287
views
Programmatic way to find the cluster version from CDSW - Cloudera Data Science Workbench
Is there any programmatic way to find out the cluster version (CDH6 or CDP7) from a CDSW session?
Could any environment variable give a fool-proof way to determine the cluster version?
0
votes
1
answer
174
views
Connecting to Impala DB using the Dask library
I am trying to connect to Impala DB through the Dask library to fetch all data from a table using read_sql_table(). I need the connection string to connect with. I have tried using the connection string ...
0
votes
1
answer
322
views
Connect HBase via Knox using HBase Java Client on CDP
I need to connect to HBase via Knox using the HBase Java Client. I have the following Knox details:
Knox_Url: https://knox-host:port/gateway/cdp-proxy-api/hbase
Username: knox_user_name
Password: ...
0
votes
0
answers
3
views
COMPUTE STATS in Impala results in DiskErrorException
I'm trying to execute a compute stats (COMPUTE STATS db.table;) on one of my tables via Impala (on Cloudera Data Platform), but for this table only I'm encountering the following error:
...
0
votes
0
answers
7
views
Getting NoSuchMethodError while executing a Spark job in CDP
I am getting the error below while executing a Spark job in Cloudera. I have placed a JAR file in HDFS and am trying to start the Spark job through the Oozie client.
...
0
votes
0
answers
63
views
Cloudera Enterprise (Community Edition) for RHEL 8
We are running a mini DWH platform on the Cloudera Enterprise community version. The underlying operating system is RHEL 7.
Version: Cloudera Express 6.0.1 (#610811 built by jenkins on 20181002-0044 git: ...
0
votes
1
answer
225
views
Is the CDF feature possible using delta-spark on the Cloudera distribution?
Our application uses an on-premise CDP (Cloudera) cluster for submitting PySpark jobs.
The Spark version is 2.x.
We are now exploring the option to have CDC datasets processed and merged with ...
0
votes
1
answer
247
views
How to find the time difference between two timestamps in seconds and milliseconds in Hive and Impala
I need help finding the time difference between two timestamps in seconds and milliseconds in Hive and Impala. We are using a CDP cluster. The two columns are of string datatype with values in the format yyyy-...
0
votes
0
answers
87
views
Hue PySpark connector using Livy - Increase Spark driver memory for interactive sessions
We are using CDP Private Cloud 7.1.7 and have configured the Hue connector for PySpark using Livy. By default the driver launches with 1 GB of memory, and I need to increase this as some of the code ...
0
votes
1
answer
330
views
Container OOM when writing a DataFrame to Parquet files in a Spark job
I'm using a Machine Learning Workspace in Cloudera Data Platform (CDP). I created a session with 4 vCPU/16 GiB memory and enabled Spark 3.2.0.
I'm using Spark to load one month of data (the whole month ...
0
votes
1
answer
158
views
Connection to a remote Hadoop cluster (CDP) through a Linux server
I'm new to PySpark and I want to connect to a remote Hadoop cluster (CDP) through a Linux server by using the spark-submit command.
Any help would be appreciated.
I need a spark-submit command to connect to the remote ...
0
votes
1
answer
213
views
CDP Spark cluster mode reading a Hive table: "Delegation Token can be issued only with kerberos or web authentication"
My environment:
CDP version: 7.4.4
Spark version: 2.4.7.7.1.7.0-551
My Java code is this:
My submit command:
./spark-submit --class com.abc.bdms.sparksql.SparkSQLDriver --master yarn --deploy-mode cluster --executor-...
0
votes
1
answer
240
views
Migration from HDP non-secure cluster to CDP secure cluster
We are migrating HDFS data from a non-secure HDP cluster to a secure CDP cluster. The Cloudera documentation mentions "distcp" as a tool to handle the ...
0
votes
1
answer
74
views
Hive - create table - missing EOF at 'SORT' near ')'
I get this error when I try to execute the CREATE query below.
Any suggestions?
ERROR: -------------------------------------------------------------------------
[sshexec] 2022-08-22 11:48:36: >> ...
0
votes
1
answer
53
views
Apache NiFi on Cloudera: changing variables from Unauthorized Referencing Components to Referencing Processors
The goal is to move the processors that are using a variable from "Unauthorized Referencing Components" to "Referencing Processors". I've recently moved from HDP to CFM for Apache NiFi ...
0
votes
0
answers
487
views
Case-insensitive comparison in Hive
I have a requirement where I need to do case-insensitive joins across the system and I don't wish to apply upper/lower functions.
I tried setting TBLPROPERTIES('serialization.encoding'='...
0
votes
1
answer
1k
views
Apache Tez tasks on hold at the Application Master
I have a Tez problem: when running about 14 queries at the same time, some of them are delayed by more than 5 minutes, but the cluster utilization is just 14%.
This is the message that I am talking ...
0
votes
1
answer
396
views
Cloudera CDP Private Cloud - Installation failed on hosts
I get an "installation failed on hosts" error while using Cloudera Manager to install the CDP 7.1.4 runtime on a trial basis. For this purpose I have spun up two VMs (Ubuntu 18), which use a NatNetwork to ...